CS 334
Programming Languages
Spring 2000

Lecture 8

Disjoint Union:

Variant record - type1 union type2 w/discriminant

Support alternatives w/in type:

Ex.

        RECORD
           name : string;
           CASE status : (student, faculty) OF
              student: gpa : real;
                       class : INTEGER;
           |  faculty: rank : (Assis, Assoc, Prof);
           END;
        END;

Saves space yet (hopefully) provides type security. Saves space because the amount of space reserved for the variant part of a variable of this type is the size of the largest variant.

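Type security fails in Pascal / MODULA-2 since variants are not protected: the tag can be changed w/o changing the variant fields, and fields of any variant can be accessed regardless of the tag.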

How is this supported in ML?

datatype IntReal = INTEGER of int | REAL of real;

Can think of enumerated types as variant w/ only tags!

NOTICE: Type safe. Clu and Ada also support type-safe case for variants:

Ada: Variants - declared as parameterized records:

type ShapeKind is (Triangle, Square);  -- name introduced here: Ada needs a named discriminant type

type geometric (Kind : ShapeKind := Square) is
    record
       color : ColorType := Red;
       case Kind is
          when Triangle =>
                 pt1, pt2, pt3 : Point;
          when Square =>
                 upperleft : Point;
                 length : INTEGER range 1..100;
       end case;
    end record;

ob1 : geometric;           -- default is Square
ob2 : geometric(Triangle); -- frozen, can't be changed

Avoids Pascal's problems w/holes in typing.

Illegal to change "discriminant" alone.

ob1 := ob2;   -- OK
ob2 := ob1;   -- generates run-time check to ensure Triangle

If want to change discriminant, must assign values to all components of record:

ob1 := (Color=>Red,Kind=>Triangle,pt1=>a,pt2=>b,pt3=>c);

If write code

    ... ob1.length...

then converted to run-time check:

    if ob1.Kind = Square then ... ob1.length ....
                         else raise constraint_error;
    end if;

Fixes type insecurity of Pascal

Note disjoint union is not same as set-theoretic union, since have tags.

    IntReal = {INTEGER} x int + {REAL} x real

C supports undiscriminated unions:

    typedef union {int i; float r;} utype;

As usual with C, it is presumed that the programmer knows what he/she is doing and no static or run-time checking is performed.
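
A minimal sketch of how a C programmer can rebuild the discriminant by hand around an undiscriminated union. The names Tag, IntReal, mk_int, mk_real and as_double are made up for illustration. Nothing in the language stops code from reading u.r while the tag says TAG_INT; that is exactly the check C omits.

```c
/* Hand-made discriminant: C itself never consults it. */
typedef enum { TAG_INT, TAG_REAL } Tag;

typedef struct {
    Tag tag;                        /* which variant is live */
    union { int i; float r; } u;    /* storage shared by both variants */
} IntReal;

IntReal mk_int(int i)    { IntReal v; v.tag = TAG_INT;  v.u.i = i; return v; }
IntReal mk_real(float r) { IntReal v; v.tag = TAG_REAL; v.u.r = r; return v; }

/* The "case" on the tag must be written (and kept correct) by hand. */
double as_double(IntReal v) {
    switch (v.tag) {
    case TAG_INT:  return (double)v.u.i;
    case TAG_REAL: return (double)v.u.r;
    }
    return 0.0;  /* only reached if the tag holds an invalid value */
}
```

Compare with the ML datatype above, where the constructors INTEGER and REAL play the role of the tag and the case analysis is checked by the compiler.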

Mappings:

Encompasses functions w/ both infinite and finite domains.

Arrays:

homogeneous collection of data.

Mapping from index type to range type
E.g. Array [1..10] of Real corresponds to {1,...,10} -> Real

Operations and relations: selection ".[.]", :=, =, and occasionally slices.

E.g. A[2..6] represents an array composed of A[2] to A[6]

Index range and location where array stored can be bound at compile time, unit activation, or any time.

static: FORTRAN
semi-static: Pascal
(semi-)dynamic: ALGOL 60, Ada
flexible: Algol 68 & Clu

In both static and semi-static languages the index set of an array is bound at compile time. The difference is that with static arrays, the location of the array in memory is bound at compile time (as in FORTRAN), while with semi-static, the size of the array is bound at compile time, but its location is determined at run-time.

For instance, in Pascal, an array stored in a local variable is allocated on the run-time stack, and its location on the stack may vary in different invocations of the procedure.

With semi-dynamic (or dynamic) arrays, the index set (and hence size) of the array may vary at run-time. For instance in ALGOL 60 or Ada, an array held in a local variable may have index bounds determined by a parameter to the routine. It is called semi-dynamic because the size is fixed once the routine has been activated.

A flexible array is one whose size can change at any time during the execution of a program. Thus, while a particular size array may be allocated when a procedure is invoked, the array may be expanded in the middle of a loop if more space is needed.

The key to these differences is binding time, as usual!
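
The four binding times can all be imitated in C, which makes the differences concrete. This is only a sketch: the function names are invented for illustration, and the semi-dynamic case assumes a compiler that supports C99 variable-length arrays.

```c
#include <stdlib.h>

double fixed_table[10];              /* static: size and address fixed before the program runs */

double sum_static(void) {
    double s = 0.0;
    for (int i = 0; i < 10; i++) { fixed_table[i] = i; s += fixed_table[i]; }
    return s;
}

double sum_local(void) {
    double a[10];                    /* semi-static: size fixed at compile time, address chosen at each call */
    double s = 0.0;
    for (int i = 0; i < 10; i++) { a[i] = i; s += a[i]; }
    return s;
}

double sum_first(int n) {            /* assumes n > 0 */
    double a[n];                     /* semi-dynamic: size fixed when the routine is activated */
    double s = 0.0;
    for (int i = 0; i < n; i++) { a[i] = i + 1; s += a[i]; }
    return s;
}

/* flexible: the array grows in the middle of the loop when it fills up.
   Returns the number of squares <= limit stored in *out, or -1 on failure. */
int collect_squares(int limit, int **out) {
    int cap = 2, len = 0;
    int *a = malloc(cap * sizeof *a);
    if (a == NULL) return -1;
    for (int k = 1; k * k <= limit; k++) {
        if (len == cap) {
            int *bigger = realloc(a, 2 * cap * sizeof *a);
            if (bigger == NULL) { free(a); return -1; }
            a = bigger;
            cap *= 2;
        }
        a[len++] = k * k;
    }
    *out = a;
    return len;
}
```

The caller of collect_squares owns the result and must free it; that bookkeeping is part of the price of flexibility.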

Function abstractions:

S->T ... function f(s:S):T (where S could be n-tuple)

What if S were a record instead of an n-tuple?

Operations: abstraction and application, sometimes composition.

What is difference from an array? Efficiency, esp. w/update.

	update f arg result x = if x = arg then result else f x

	update f arg result = fn x => if x = arg then result else f x
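
C has no closures, so the sketch below represents the result of update as a small record instead. It is an illustration under that assumption, and the names Fn, lift, update, apply and square are invented. It shows where the efficiency difference comes from: every update adds one more comparison to each later application, while an array update is a single store.

```c
#include <stddef.h>

typedef struct Fn {
    int (*base)(int);        /* non-NULL for an ordinary function */
    const struct Fn *f;      /* otherwise: the function being updated ... */
    int arg, result;         /* ... at this one point */
} Fn;

int square(int x) { return x * x; }

/* Wrap an ordinary C function as an Fn. */
Fn lift(int (*base)(int)) {
    Fn g = { base, NULL, 0, 0 };
    return g;
}

/* update f arg result: a new function, f itself is left unchanged. */
Fn update(const Fn *f, int arg, int result) {
    Fn g = { NULL, f, arg, result };
    return g;
}

/* Application walks the chain of updates, newest first. */
int apply(const Fn *fn, int x) {
    while (fn->base == NULL) {
        if (x == fn->arg) return fn->result;
        fn = fn->f;
    }
    return fn->base(x);
}
```

An updated Fn points at the one it was built from, so the older value must stay alive as long as the newer one is in use.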

Procedure can be treated as having type S -> unit for uniformity.

Powerset:

	set of elt_type;

Typically implemented as bitset or linked list of elts

Operations and relations: All typical set ops, :=, =, subset, .. in ..
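
A sketch of the bitset implementation for a base type of 0..31, one bit per possible element. The names Set32, set_add and so on are illustrative only. Each set operation becomes one or two machine instructions, which is only possible because the base type is small and discrete.

```c
#include <stdbool.h>
#include <stdint.h>

/* A set over the base type 0..31; element e is present iff bit e is 1.
   All functions assume e < 32. */
typedef uint32_t Set32;

Set32 set_add(Set32 s, unsigned e)      { return s | (UINT32_C(1) << e); }
bool  set_in(unsigned e, Set32 s)       { return ((s >> e) & 1u) != 0; }
Set32 set_union(Set32 a, Set32 b)       { return a | b; }
Set32 set_intersect(Set32 a, Set32 b)   { return a & b; }
Set32 set_diff(Set32 a, Set32 b)        { return a & ~b; }
bool  set_subset(Set32 a, Set32 b)      { return (a & ~b) == 0; }
```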

Why need base set to be primitive type? What if base set were records?

Recursive types:

Examples:

  	tree = Empty | Mktree of int * tree * tree
	list = Nil | Cons of int * list

In most lang's built by programmer from pointer types.

Sometimes supported by language (e.g. Miranda, Haskell, ML).

Why can't we have direct recursive types in ordinary imperative languages?

OK if use ref's:

	list = POINTER TO RECORD
			first:integer;
			rest: list
		END;
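
The same construction in C, as a sketch with invented names (Node, cons, length, free_list). A struct that contained itself directly would need unbounded space, but a pointer has a fixed size, so the record does too; Nil is represented by NULL.

```c
#include <stdlib.h>

/* list = Nil | Cons of int * list */
typedef struct Node {
    int first;
    struct Node *rest;   /* pointer to the rest of the list, NULL for Nil */
} Node;

Node *cons(int first, Node *rest) {
    Node *n = malloc(sizeof *n);
    if (n == NULL) abort();      /* out of memory: give up in this sketch */
    n->first = first;
    n->rest = rest;
    return n;
}

int length(const Node *l) {
    int n = 0;
    for (; l != NULL; l = l->rest) n++;
    return n;
}

void free_list(Node *l) {
    while (l != NULL) {
        Node *next = l->rest;
        free(l);
        l = next;
    }
}
```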

Recursive types may have many sol'ns

E.g. list = {Nil} union (int x list) has following sol'ns:

finite sequences of integers followed by Nil: e.g., (2,(5,Nil))
finite or infinite sequences, where if finite then end with Nil

Similarly with trees, etc.

Theoretical result: Recursive equations always have a least solution - though infinite set if real recursion.

Can get via finite approximation. I.e.,

   list₀ = {Nil}
   list₁ = {Nil} union (int x list₀) 
         = {Nil} union {(n, Nil) | n in int}

   list₂ = {Nil} union (int x list₁) 
         = {Nil} union {(n, Nil) | n in  int}
                 union {(m,(n, Nil)) | m, n in int}

      ...

   list = Union_n list_n

Very much like unwinding definition of recursive function

	fact = fun n => if n = 0 then 1 else n * fact (n-1)
	
	fact₀ = fun n => if n = 0 then 1 else undef
	
	fact₁ = fun n => if n = 0 then 1 else n * fact₀(n-1)
	      = fun n => if n = 0, 1 then 1 else undef
	      
	fact₂ = fun n => if n = 0 then 1 else n * fact₁(n-1)
	      = fun n => if n = 0, 1 then 1 else 
	                 if n = 2 then 2 else undef
	...


	fact = Union_n fact_n
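
The approximations can be run directly. In this sketch (names fact_approx and UNDEF are invented) the index k is passed as an extra argument and "undef" is modelled by the value -1, so fact_approx(k, n) is defined exactly when 0 <= n <= k.

```c
#define UNDEF (-1L)   /* stands in for "undefined" */

/* fact_k(n): the k-th finite approximation of factorial. */
long fact_approx(int k, int n) {
    if (n == 0) return 1;             /* every approximation knows fact(0) */
    if (k == 0) return UNDEF;         /* fact_0 is undefined everywhere else */
    long r = fact_approx(k - 1, n - 1);
    return r == UNDEF ? UNDEF : n * r;
}
```

Each larger k agrees with the smaller ones wherever they are defined and adds one more point, which is why the union of all of them is the full factorial function.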

Notice a solution to T = A + (T->T) is inconsistent with classical mathematics: by cardinality, no set T with more than one element can contain all functions from T to T!
In spite of that, however, it can be used in Computer Science:

	datatype univ = Base of int | Func of (univ -> univ);

Sequence:

Lists

Supported in most fcnal languages

operations: hd, tail, cons, length, etc.

sequential files

File operations: Erase, reset, read, write, check for end.

Persistent data - files.

strings:

ops: <, length, substr

Are strings primitive or composite?

Composite (arrays) in Pascal, Modula-2, ...
Primitive in ML
Lists in Miranda and Prolog: provides more flexibility (no length bound)

User-Defined Types

User gets to name new types. Why?

more readable
Easy to modify if localized
Factorization - why copy same complex def. over and over (possibly making mistakes)
Added consistency checking in many cases.

STATIC VERSUS DYNAMIC TYPING

Static: Most languages use static binding of types to variables, usually in declarations

	var x : integer  {bound at translation time}

The variable can only hold values of that type. (Pascal/Modula-2/C, etc.)

FORTRAN has implicit declaration using naming conventions

If start with "I" to "N", then integer, otherwise real.

Other languages will "infer" type of undeclared variables.

In either case, run real danger of problems due to typos.

Example in ML, if

	datatype Stack = Nil | Push of int;

then define

	fun f Push 7 = ...

What error occurs?

Answer: Push is taken as a parameter name, not a constructor.
Therefore f is given type: A -> int -> B rather than the expected: Stack -> B

Dynamic: Variables typically do not have a declared type. Type of value may vary during run-time. Esp. useful w/ heterogeneous lists, etc. (LISP/SCHEME).

Dynamic more flexible, but more overhead since must check type before performing operations (therefore must store tag w/ value).

Dynamic binding found in APL and LISP.

Type of variable may change during execution.
E.g., may have x := 0 at one point and x := [5,2,3] at some other point, yet x is only declared once.

Dynamic binding harder to implement since can't allocate a fixed amount of space for variables. Therefore often implemented as pointer to memory holding value.
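
A sketch of that implementation strategy, with invented names (Value, new_int, new_list, add_one). The variable x is just a pointer; each value it points to carries its own tag, and the tag is checked at run-time before an operation is applied.

```c
#include <stdlib.h>
#include <string.h>

typedef enum { T_INT, T_LIST } TypeTag;

typedef struct {
    TypeTag tag;      /* stored with every value */
    int     n;        /* the integer itself, or the list length */
    int    *items;    /* list elements, NULL for an integer */
} Value;

Value *new_int(int n) {
    Value *v = malloc(sizeof *v);
    if (v == NULL) abort();
    v->tag = T_INT; v->n = n; v->items = NULL;
    return v;
}

Value *new_list(const int *items, int n) {   /* assumes n > 0 */
    Value *v = malloc(sizeof *v);
    if (v == NULL) abort();
    v->tag = T_LIST; v->n = n;
    v->items = malloc((size_t)n * sizeof *v->items);
    if (v->items == NULL) abort();
    memcpy(v->items, items, (size_t)n * sizeof *v->items);
    return v;
}

void free_value(Value *v) { free(v->items); free(v); }

/* Returns 1 and stores v + 1 in *result, or 0 on a run-time type error. */
int add_one(const Value *v, int *result) {
    if (v->tag != T_INT) return 0;
    *result = v->n + 1;
    return 1;
}
```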

TYPES IN HISTORY OF PROGRAMMING LANGUAGES

FORTRAN

Built-In: Integer, Real, Double Precision, Complex, Logical

No characters or strings, no user-defined types of any sort.

Arrays - at most 3-dim'l of built-in type. Subscripts begin at 1

Orig., restricted form of subscript expressions.

No records or sets. Many holes in typing.

ALGOL 60

Built-In: Integer, Real, Boolean, limited strings

Arrays of built-in types - no limit on dim'n, bounds any integers, semi-dynamic arrays

No records or sets. Strongly and statically typed.

Pascal

Arrays

	Array [1..10, 'a'..'z'] of Real = Array [1..10] of Array ['a'..'z'] of Real

User fooled into thinking Array[A,B] of C is AxB->C, but really A->B->C.
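
C's two-dimensional arrays behave the same way, which gives a quick way to see the A->B->C reading. In this sketch (ROWS, COLS, row_sum and demo_row_sum are invented names) indexing once yields a whole row, itself an array, which can be passed on by itself.

```c
#define ROWS 3
#define COLS 4

/* Takes one row: an array indexed by the second index only. */
double row_sum(const double row[COLS]) {
    double s = 0.0;
    for (int j = 0; j < COLS; j++) s += row[j];
    return s;
}

double demo_row_sum(int i) {          /* assumes 0 <= i < ROWS */
    double grid[ROWS][COLS];          /* an array of ROWS arrays of COLS doubles */
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            grid[r][c] = r * 10 + c;
    return row_sum(grid[i]);          /* grid[i] supplies only the first index */
}
```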

Any discrete type as index.

No semi-dynamic arrays. Result of 2 principles:

All types must be determinable at compile time.
Array bounds are part of type.

Therefore, must have statically determinable array bounds.

Type of actual parameters must agree w/ type of formals

Therefore, no general sort routines, etc.

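The major problem with Pascal: parameter types of procedural parameters are not declared, so calls through them can't be checked.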

		Procedure x(...; procedure y;...)

			:

			y(a,2);

Fixed in (new) ANSI standard.

No checking if type of file read in matches what was originally written.

2. Problems w/ type compatibility

Assignment compatibility:

When is x := y legal? x : integer, y : 1..10? reverse?

What if type hex = 0..15; ounces = 0..15;

var x : hex; y : ounces;

Is x := y legal?

Original report said both sides must have identical types.

When are types identical?

Ex.:

    Type    T = Array [1..10] of Integer;
    Var  A, B : Array [1..10] of Integer;
             C : Array [1..10] of Integer;
             D : T;
             E : T;

Which variables have the same type?

   T1 = record a : integer; b : real  end; 
   T2 = record c : integer; d : real  end;
   T3 = record b : real; a : integer  end;

Which are the same?

Worse:

   T = record info : integer; next : ^T  end; 
   U = record info : integer; next : ^V  end; 
   V = record info : integer; next : ^U  end;

Ada uses Name Equivalence_A

Pascal & Modula-2 use Name Equivalence (called declaration equivalence in text) for most part. Check!

Modula-3 uses Structural Equivalence
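
C shows both behaviors side by side, which may help in answering the questions above. In this sketch (function names invented) typedef only gives another name to an existing type, so hex and ounces are interchangeable, while two struct declarations with different names are different types even when their fields match.

```c
typedef int hex;       /* typedef only names an existing type ... */
typedef int ounces;    /* ... so hex, ounces and int are all the same type */

struct T1 { int a; float b; };
struct T2 { int a; float b; };   /* same structure, different name */

int mix_units(void) {
    hex    x = 7;
    ounces y = x;      /* accepted: no distinction is made */
    return y;
}

struct T1 copy_fields(struct T2 src) {
    struct T1 dst;
    /* dst = src;  would be rejected: struct T1 and struct T2 are distinct types */
    dst.a = src.a;     /* so the copy is done field by field */
    dst.b = src.b;
    return dst;
}
```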

Two types are assignment compatible iff

have equivalent types or
one subrange of other or
both subranges of same base type.

Ada

Enumeration types

Can overload values:

    type Color is (Red, Blue, Green);
    type Mood is (Happy, Blue, Mellow);

If ambiguous can qualify w/ type names:

    Color'(Blue), Mood'(Blue)

Subranges: declared w/ range attribute.

e.g., type Hex is range 0..15;

Other attributes available to modify type definitions:

	type Accurate is digits 20;
	type Money is delta 0.01 range 0.00 .. 1000.00;     -- fixed pt!

Can extract type attributes:

	Hex'FIRST -> 0
	Hex'LAST  -> 15

Can initialize variables in declaration:

	declare k : integer := 0

Arrays

"Constrained" - semi-static like Pascal

	type Two_D is array (1..10, 'a'..'z') of Real

or "Unconstrained" (what we called semi-dynamic earlier)

	type Real_Vec is array (INTEGER range <>) of REAL;

Generalization of open array parameters of MODULA-2.

Of course, to use, must specify bounds,

	declare x : Real_Vec (1..10)

or, inside procedure:

   Procedure sort (Y: in out Real_Vec; N: integer) is -- Y is open array parameter
      Temp1 : Real_Vec(1..N);             -- depends on N
      Temp2 : Real_Vec (Y'FIRST..Y'LAST); -- depends on parameter Y
      begin 
         for I in Y'FIRST ..Y'LAST loop
            ...
         end loop;
         ... 
      end sort;

Note Ada also has local blocks (like ALGOL 60)

All unconstrained types (w/ parameters) elaborated at block entry (semi-dynamic)

String type is predefined open array of chars:

	array (POSITIVE range <>) of character;

Can take slice of 1-dim'l array.

E.g., if

    Line : string(1..80)

Then can write

    Line(10..19) := ('a','b','c','d','e','f','g','h','i','j');
                                         -- gives assignment to slice

Because of this structure assignment, can have constant arrays.

Back to:

CS 334 home page

Kim Bruce's home page

CS Department home page

kim@cs.williams.edu
