CS 334 Lecture 9

Adding a run-time environment to the interpreter

We earlier described substitution as a reasonable mechanism for interpreting function application (called beta-conversion), but there are a few places where we must be very careful about name clashes when terms have free variables.
(See section 10.7 in the text for details.)

We normally expect that changing the names of formal parameters should not make any difference, but ...

Suppose we evaluate:

   let  fun g x y = x + y  in g y end;

(or in our language PCF:

   (fn g => g y) (fn x => fn y => x+y))

If we evaluate blindly we get:

   fn y => y + y

Notice that because of scoping, the actual parameter y has become captured by the formal parameter y!

We should get: fn w => y + w, which has a very different meaning!!

(Note that we did not run into this problem earlier since during our evaluations we never worked with terms with free variables - when going inside functions we replaced all formal parameters by the actual parameters, which didn't involve free variables).

A different order of evaluation would have brought forth the same problem, however.

We would like to have fn x => B to represent the same function as
fn y => B[x:=y] as long as y doesn't occur freely in B. (Called alpha-conversion)

If you always alpha-convert to a new bound variable before substituting in, you will never have problems.
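
The rule above can be sketched in Python as a capture-avoiding substitution. This is only an illustration under stated assumptions: terms are encoded as tuples, and the helper names (fresh, free_vars, subst) and the "%" naming scheme for new variables are hypothetical, not part of the course interpreter.

```python
import itertools

_counter = itertools.count()

def fresh(base):
    """Return a new variable name (assumes source names never contain '%')."""
    return f"{base}%{next(_counter)}"

def free_vars(t):
    """Set of variables occurring free in term t."""
    tag = t[0]
    if tag == "var":
        return {t[1]}
    if tag == "fn":
        return free_vars(t[2]) - {t[1]}
    # "app" and "add" both carry two subterms
    return free_vars(t[1]) | free_vars(t[2])

def subst(t, x, s):
    """Capture-avoiding substitution t[x := s]."""
    tag = t[0]
    if tag == "var":
        return s if t[1] == x else t
    if tag == "fn":
        param, body = t[1], t[2]
        if param == x:
            return t  # x is rebound here, so nothing inside refers to the outer x
        if param in free_vars(s):
            # alpha-convert first so a free variable of s is not captured
            new = fresh(param)
            body = subst(body, param, ("var", new))
            param = new
        return ("fn", param, subst(body, x, s))
    return (tag, subst(t[1], x, s), subst(t[2], x, s))

# Body of g after its first parameter is stripped: fn y => x + y
inner = ("fn", "y", ("add", ("var", "x"), ("var", "y")))
result = subst(inner, "x", ("var", "y"))
print(result)  # a function whose bound variable was renamed, with y still free
```

The result has the shape fn w => y + w for some new name w, so the actual parameter y stays free instead of being captured.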

Evaluate terms via environments:

Env = string -> values

An environment, rho, tells value of identifiers in term.

Write [[e]] rho for meaning of e with respect to environment rho.

E.g. if rho(x) = 12 and rho(y) = 2, then [[x+y]] rho = 14.
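
A minimal sketch of the same example in Python, assuming an environment is a dictionary from identifier names to values and terms are tagged tuples. The function name meaning and the tuple encoding are hypothetical choices made for this illustration.

```python
def meaning(e, rho):
    """[[e]] rho for numerals, identifiers, and sums."""
    tag = e[0]
    if tag == "num":
        return e[1]
    if tag == "var":
        return rho[e[1]]  # identifiers are looked up, not substituted away
    if tag == "add":
        return meaning(e[1], rho) + meaning(e[2], rho)
    raise ValueError(f"unknown term: {e!r}")

rho = {"x": 12, "y": 2}
print(meaning(("add", ("var", "x"), ("var", "y")), rho))  # prints 14
```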

How does function application result in change of environment?

[[(fn x => body) actual]]rho = [[body]] rho [ [[actual]]rho / x]

where rho[v / x] is environment like rho except x has value "v".

This and rec are the only rules in which the environment changes!

Rest of rules look like the old interpreter (except identifiers looked up in environment)!

Replaces all uses of subst!

This means that computation no longer takes place by rewriting terms into new terms; interp is now a function from term to value.

Note that

	let val x = arg in e end

is equivalent to

	(fn x => e) arg

Must worry about scoping problems:

   val test = let 
                  val x = 3;
                  fun f y = x + y;
                  val x = 12
              in 
                  x + (f 7)
              end;

What is value of test?

Change in scope is reflected by change in environment.

With functions must remember environment function was defined in!

When apply function, apply in defining environment.

test is equivalent to

   (fn x => (fn f => ((fn x => x + (f 7)) 12)) (fn y => x + y)) 3

Then

 
   [[(fn x => (fn f => ((fn x => x + (f 7)) 12)) (fn y => x + y)) 3]] rho0
	= [[(fn f => ((fn x => x + (f 7)) 12)) (fn y => x + y)]] rho1
	= [[(fn x => x + (f 7)) 12]] rho2
	= [[x + (f 7)]] rho3
	= 12 + ([[fn y => x + y]] rho1) 7
	= 12 + [[x + y]] rho4
	= 12 + 3 + 7 
	= 22

where rho0 is the starting environment and

   rho1 = rho0 [ [[3]] rho0 / x] = rho0[ 3 / x]
   rho2 = rho1 [ [[fn y => x + y]] rho1 / f] 	<-	Closure for f
   rho3 = rho2 [ [[12]] rho2 / x] = rho2[ 12 / x]
   rho4 = rho1 [ 7 / y]
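
The evaluation above can be replayed with a small closure-based interpreter. This is a sketch in Python rather than the ML interpreter used in the course. The tuple encoding of terms and the names Closure, extend, and interp are assumptions made for the illustration.

```python
from dataclasses import dataclass

@dataclass
class Closure:
    param: str
    body: tuple
    env: dict  # the environment the function was defined in

def extend(rho, x, v):
    """rho[v / x]: a new environment like rho except x has value v."""
    new_rho = dict(rho)
    new_rho[x] = v
    return new_rho

def interp(e, rho):
    tag = e[0]
    if tag == "num":
        return e[1]
    if tag == "var":
        return rho[e[1]]
    if tag == "add":
        return interp(e[1], rho) + interp(e[2], rho)
    if tag == "fn":
        return Closure(e[1], e[2], rho)  # remember the defining environment
    if tag == "app":
        f = interp(e[1], rho)
        arg = interp(e[2], rho)
        # the body runs in the closure's environment, not the caller's
        return interp(f.body, extend(f.env, f.param, arg))
    raise ValueError(f"unknown term: {e!r}")

# (fn x => (fn f => ((fn x => x + (f 7)) 12)) (fn y => x + y)) 3
test = ("app",
        ("fn", "x",
         ("app",
          ("fn", "f",
           ("app",
            ("fn", "x", ("add", ("var", "x"),
                         ("app", ("var", "f"), ("num", 7)))),
            ("num", 12))),
          ("fn", "y", ("add", ("var", "x"), ("var", "y"))))),
        ("num", 3))

print(interp(test, {}))  # prints 22
```

If the body of f were instead evaluated in the calling environment rho3, the lookup of x would find 12 and the result would be 12 + 12 + 7 = 31.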

Pascal

Built-In:

Integer, Real, Boolean, Char, no strings except as packed array of char.

Enumeration types. Subranges

Guard against errors, save space. (only for discrete types)

Arrays

Hierarchical, but only one-dim'l.

Array [1..10, 'a'..'z'] of Real = Array [1..10] of Array ['a'..'z'] of Real

User fooled into thinking Array[A,B] of C is AxB->C, but really A->B->C.

Any discrete type as index.

No semi-dynamic arrays. Result of 2 principles:

1. All types must be determinable at compile time.

2. Array bounds are part of type.

Therefore, must have statically determinable array bounds.

3. Add a third rule: type of actual parameters must agree w/ type of formals

Therefore, no general sort routines, etc.

The major problem with Pascal

Variant records

as above - introduce holes in type system.

Pointers

must point to objects of specific type (unlike PL/I)

Sets

supported - but often limited implementation.

Sequential files

of any (non-file) type.

Problems with Types in Pascal

1. Holes in typing system with variant records, procedure parameters, and files.

		Procedure x(...; procedure y;...)

			:

			y(a,2);

Fixed in (new) ANSI standard.

No checking if type of file read in matches what was originally written.

2. Problems w/ type compatibility

Assignment compatibility:

When is x := y legal? x : integer, y : 1..10? reverse?

What if type hex = 0..15; ounces = 0..15;

var x : hex; y : ounces;

Is x := y legal?

Original report said both sides must have identical types.

When are types identical?

Ex.:

    Type    T = Array [1..10] of Integer;
    Var  A, B : Array [1..10] of Integer;
             C : Array [1..10] of Integer;
             D : T;
             E : T;

Which variables have the same type?

Name Equivalence A (strict)

Same type iff have same name --> D, E only

Name Equivalence (called declaration equivalence in text)

Same type iff have same name or declared together

--> A, B and D, E only.

Structural Equivalence

Same type iff have same structure --> all same.
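
The three answers can be checked with a small model of the declarations. This is a sketch only: the record of each variable (type name if any, which declaration introduced it, and its structure) and all Python names are hypothetical, chosen to mirror the example with A, B, C, D, E.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ArrayType:
    lo: int
    hi: int
    elem: str

@dataclass(frozen=True)
class VarDecl:
    name: str
    type_name: Optional[str]  # None when the type is written out anonymously
    decl_id: int              # which declaration introduced the variable
    structure: ArrayType

arr = ArrayType(1, 10, "Integer")
decls = [
    VarDecl("A", None, 1, arr),  # A, B declared together
    VarDecl("B", None, 1, arr),
    VarDecl("C", None, 2, arr),
    VarDecl("D", "T", 3, arr),
    VarDecl("E", "T", 4, arr),
]

def strict_name_equiv(v, w):
    return v.type_name is not None and v.type_name == w.type_name

def declaration_equiv(v, w):
    return strict_name_equiv(v, w) or v.decl_id == w.decl_id

def structural_equiv(v, w):
    return v.structure == w.structure

def classes(equiv):
    """Group the variable names into classes under the given rule."""
    groups = []
    for v in decls:
        for g in groups:
            if equiv(g[0], v):
                g.append(v)
                break
        else:
            groups.append([v])
    return [[v.name for v in g] for g in groups]

print(classes(strict_name_equiv))   # [['A'], ['B'], ['C'], ['D', 'E']]
print(classes(declaration_equiv))   # [['A', 'B'], ['C'], ['D', 'E']]
print(classes(structural_equiv))    # [['A', 'B', 'C', 'D', 'E']]
```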

Structural not always easy. Let

  T1 = record a : integer; b : real  end; 
  T2 = record c : integer; d : real  end;
  T3 = record b : real; a : integer  end;

Which are the same?

Worse:

  T = record info : integer; next : ^T  end; 
  U = record info : integer; next : ^V  end; 
  V = record info : integer; next : ^U  end;
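
One way to decide structural equivalence for recursive types such as T, U, V is to assume a pair is equal while its components are being compared. The Python sketch below does this under some assumptions of its own: types are written as strings, field names and field order are taken to matter, and the extra type W is a hypothetical counterexample not in the notes.

```python
TYPES = {
    "T": [("info", "integer"), ("next", "^T")],
    "U": [("info", "integer"), ("next", "^V")],
    "V": [("info", "integer"), ("next", "^U")],
    "W": [("info", "real"), ("next", "^W")],  # hypothetical, differs in info
}

def struct_equal(a, b, assumed=frozenset()):
    """Structural equality of type expressions a and b."""
    if a == b:
        return True
    if (a, b) in assumed:
        # this pair is already being compared further up: assume it holds
        return True
    if a.startswith("^") and b.startswith("^"):
        return struct_equal(a[1:], b[1:], assumed)
    if a in TYPES and b in TYPES:
        assumed = assumed | {(a, b), (b, a)}
        fa, fb = TYPES[a], TYPES[b]
        return len(fa) == len(fb) and all(
            na == nb and struct_equal(ta, tb, assumed)
            for (na, ta), (nb, tb) in zip(fa, fb)
        )
    return False

print(struct_equal("T", "U"), struct_equal("U", "V"), struct_equal("T", "W"))
# prints: True True False
```

Without the set of assumed pairs the comparison of T with U would recurse forever, since each next field leads back to a pair already under comparison.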

Ada uses Name Equivalence A (strict)

Pascal & Modula-2 use Name Equivalence for most part. Check!

Modula-3 uses Structural Equivalence

Two types are assignment compatible iff

have equivalent types or
one subrange of other or
both subranges of same base type.
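
A sketch of this rule in Python, assuming each type is described only by its name and, for a subrange, the name of its base type. The class Ty and the constants are hypothetical, and "equivalent types" is approximated here by identical names.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Ty:
    name: str
    base: Optional[str] = None  # base type name when this is a subrange

INTEGER = Ty("integer")
SMALL = Ty("1..10", base="integer")
HEX = Ty("hex", base="integer")
OUNCES = Ty("ounces", base="integer")
CHAR = Ty("char")

def assignment_compatible(a, b):
    if a == b:                                  # equivalent types
        return True
    if a.base == b.name or b.base == a.name:    # one is a subrange of the other
        return True
    # both are subranges of the same base type
    return a.base is not None and a.base == b.base

print(assignment_compatible(INTEGER, SMALL))  # True
print(assignment_compatible(HEX, OUNCES))     # True
print(assignment_compatible(CHAR, HEX))       # False
```

Under this rule x := y is accepted for x : hex and y : ounces, which is exactly the kind of mixing that name equivalence alone would have rejected.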

Ada

Built-In:

Integer, Real, Boolean, Char, strings.

Enumeration types.

Character and boolean are predefined enumeration types.

e.g., type Boolean is (False, True)

Can overload values:

    Color is (Red, Blue, Green)
    Mood is (Happy, Blue, Mellow)

If ambiguous can qualify w/ type names:

    Color'(Blue), Mood'(Blue)

Subranges

Declared w/range attribute.

e.g., Hex is range 0..15

Other attributes available to modify type definitions:

	Accurate is digits 20
	Money is delta 0.01 range 0.00 .. 1000.00     -- fixed pt!

Can extract type attributes:

	Hex'FIRST -> 0
	Hex'LAST  -> 15

Can initialize variables in declaration:

	declare k : integer := 0

Arrays "Constrained" - semi-static like Pascal

	type Two_D is array (1..10, 'a'..'z') of Real

or "Unconstrained" (what we called semi-dynamic earlier)

	type Real_Vec is array (INTEGER range <>) of REAL;

Generalization of open array parameters of MODULA-2.

Of course, to use, must specify bounds,

	declare x : Real_Vec (1..10)

or, inside procedure:

   Procedure sort (Y: in out Real_Vec; N: integer) is -- Y is open array parameter
      Temp1 : Real_Vec(1..N);             -- depends on N
      Temp2 : Real_Vec (Y'FIRST..Y'LAST); -- depends on parameter Y
      begin 
         for I in Y'FIRST ..Y'LAST loop
            ...
         end loop;
         ... 
      end sort;

Note Ada also has local blocks (like ALGOL 60)

All unconstrained types (w/ parameters) elaborated at block entry (semi-dynamic)

String type is predefined open array of chars:

	array (POSITIVE range <>) of character;

Can take slice of 1-dim'l array.

E.g., if

    Line : string(1..80)

Then can write

    Line(10..20) := ('a','b','c','d','e','f','g','h','i','j','k')
                                         -- gives assignment to slice
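
For comparison, a rough Python analogue of assigning to a slice, assuming a list of characters stands in for Line and a hypothetical helper assign_slice maps Ada's 1-based inclusive bounds onto Python's 0-based half-open slices. The length check mimics the requirement that the slice and the aggregate have the same number of elements.

```python
line = [" "] * 80  # stands in for Line : string(1..80)

def assign_slice(arr, first, last, values):
    """Assign values to the 1-based inclusive slice arr(first..last)."""
    if len(values) != last - first + 1:
        raise ValueError("slice and aggregate lengths differ")
    arr[first - 1:last] = values

assign_slice(line, 10, 20, list("abcdefghijk"))  # 11 positions, 11 values
print("".join(line[9:20]))  # prints abcdefghijk
```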

Because of this structure assignment, can have constant arrays.

Ada Subtypes and derived types:

Types have static properties - checked at compile time

and dynamic properties - checked at run time

Example of dynamic are range, subscript, etc.

Specify dynamic properties by defining subtype. E.g.,

   subtype digit is integer range 0..9;

Subtypes also constrain parameterized array or variant record.

	subtype short_vec is Real_Vec(1..3);
	subtype square_type is geometric (square)

Subtypes do not define a new type; they only add dynamic constraints.

Therefore can mix different subtypes of same type w/ no problems

Derived types define new types:

	type Hex is new integer range 0..15
	type Ounces is new integer range 0..15

Now Hex, Ounces, and Integer are incompatible types: treated as distinct copies of 0..15

Can convert from one to other:

	Hex(I), Integer(H), Hex(Integer(G))

Derived types inherit operators and literals from parent type.

	E.g., Hex gets 0,1,2,... +,-,*,...

Use for private (opaque) types and when don't want mixing.
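
The flavor of derived types can be imitated in Python with distinct classes that share inherited operators but refuse to mix. This is only a sketch: the base class DerivedInt is hypothetical, and the checks here happen at run time, whereas Ada rejects the mixing at compile time.

```python
class DerivedInt:
    """Hypothetical base for integer-like types restricted to 0..15."""
    LOW, HIGH = 0, 15

    def __init__(self, value):
        if not self.LOW <= value <= self.HIGH:
            raise ValueError(f"{value} outside {self.LOW}..{self.HIGH}")
        self.value = value

    def __int__(self):
        return self.value  # explicit conversion back to a plain integer

    def __add__(self, other):
        if type(other) is not type(self):
            raise TypeError(
                f"cannot mix {type(self).__name__} and {type(other).__name__}")
        return type(self)(self.value + other.value)

    def __eq__(self, other):
        return type(other) is type(self) and self.value == other.value

class Hex(DerivedInt):
    pass

class Ounces(DerivedInt):
    pass

h = Hex(3) + Hex(4)          # inherited +, result is again a Hex
g = Ounces(5)
converted = Hex(int(g))      # mirrors Hex(Integer(G))
print(h.value, converted.value)  # prints 7 5
```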

Compare Ada's solutions w/ Pascal's problems:

Helped by removing dynamic features (subrange bounds, array index bounds) from the definition of a type.

Can now have open array parameters (also introduced in ISO Pascal).

Variants fixed

Name equivalence in Ada to prevent mixing of different types. E.g., can't add Hex and Ounces.

Can define overloaded multiplication such that if

	l:Length;
	w:Width;

then l * w : Area.
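
A sketch of such an overloaded multiplication in Python, assuming three small wrapper classes around a number. The class definitions are hypothetical, and Python resolves the operator at run time rather than at compile time as Ada does.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Area:
    value: float

@dataclass(frozen=True)
class Width:
    value: float

@dataclass(frozen=True)
class Length:
    value: float

    def __mul__(self, other):
        # only Length * Width is defined, and it yields an Area
        if isinstance(other, Width):
            return Area(self.value * other.value)
        return NotImplemented

l = Length(3.0)
w = Width(2.0)
print(l * w)  # prints Area(value=6.0)
```

Multiplying two Length values is rejected with a TypeError, since no such operator was defined.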

Type completeness principle:

No operation should be arbitrarily restricted in the types of the values involved.

Avoid second-class types.

Ex. in Pascal: Restrictions on return values of functions, lack of procedure variables, etc.

ML comes much closer to satisfying.

Summary of types so far:

postpone ADT's until later

Modern tendency to strengthen static typing and avoid implicit holes in type system.

- usually explicit (dangerous) means for bypassing type system, if desired

Try to push as many errors to compile time as possible by:

Requiring overspecification through typing
Distinguishing btn diff. uses of same types (name equiv.)
Mandating constructs designed to eliminate typing holes
Minimizing or eliminating use of explicit pointers (esp. user-controlled deallocation of ptrs).

Problem: loss of flexibility which is obtainable from dynamic typing or lack of any typing.

Important direction of current research in computer science:

Provide type safety, but increase flexibility.

Important progress over last 20 years:

Polymorphism, ADT's, Subtyping & other aspects of object-oriented languages.

STORAGE

What are storable values of language? Those that cannot be selectively updated.

Varies between languages.

Pascal: primitive (integer, real, char, boolean), sets, pointers

ML: primitive, records, tuples, lists, function abstractions, ref's to vbles.

Examine how variables allocated and lifetime.

Program Units:

Separate segments of code - usually allow separate declaration of local variables.

E.g. Procedures, functions, and blocks (from ALGOL 60 & C, like parameterless procedures located in-line.)

Program unit represented during execution by unit instance, composed of code segment and activation record (gives info on parameters and local variables, and where to return after execution).

Activation Record Structure:

Return address

Access info on parameters

Space for local variables

Units often need access to non-local variables.

How is procedure call made?

To call:

1. Make parameters available to callee.

2. Save state of caller (register, prog. counter).

3. Make sure callee knows how to find where to return to.

4. Enter callee at 1st instruction.

To return:

1. Get return address and transfer execution to that point.

2. Caller restores state.

3. If fcn, make sure result value left in accessible location (register, on top of stack, etc.)
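
The call and return steps can be simulated with an explicit stack of activation records. The Python sketch below computes a factorial without using Python's own recursion. The record layout, the symbolic return addresses ("halt", "after_call"), and the variable result standing in for a result register are all assumptions made for the illustration.

```python
from dataclasses import dataclass, field

@dataclass
class ActivationRecord:
    return_address: str   # where execution resumes after the callee returns
    params: dict          # parameters made available to the callee
    locals: dict = field(default_factory=dict)

def run_factorial(n):
    stack = []
    result = None  # stands in for the register holding a function result
    # Call: push parameters and return address, then enter callee at its start.
    stack.append(ActivationRecord("halt", {"n": n}))
    pc = "entry"
    while pc != "halt":
        frame = stack[-1]
        if pc == "entry":
            if frame.params["n"] <= 1:
                result = 1
                pc = "return"
            else:
                # recursive call: the caller's state stays in its own record
                stack.append(
                    ActivationRecord("after_call", {"n": frame.params["n"] - 1}))
                pc = "entry"
        elif pc == "after_call":
            # caller resumes and picks up the callee's result
            frame.locals["sub"] = result
            result = frame.params["n"] * frame.locals["sub"]
            pc = "return"
        elif pc == "return":
            # Return: fetch the return address and transfer control there.
            pc = stack.pop().return_address
    return result

print(run_factorial(5))  # prints 120
```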

Memory allocation

Three types of languages:

Static: E.g. FORTRAN and COBOL.

Stack-Based: E.g. ALGOL-like languages (including Pascal and C).

Dynamic: LISP, PROLOG, APL, ML, Miranda, Eiffel, etc. as well as aspects of Pascal, Ada, etc.

Static using FORTRAN as example

Units: Main program, Subroutines, and Functions.

All storage (local and global) known at translation time (hence static).

Activation records can be associated with each code segment.

Structure:

Return address

Access info on parameters

Space for local variables

At compile time, both instructions and vbles can be accessed by

(unit name, offset)

At link time can resolve to absolute addresses.
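
A toy sketch of that resolution step in Python. The unit names, their sizes, and the functions link and resolve are hypothetical. The only point is that a (unit name, offset) pair becomes an absolute address once every unit has been given a base address.

```python
unit_sizes = {"MAIN": 40, "SUB1": 24, "FUNC1": 16}  # hypothetical sizes in words

def link(sizes):
    """Lay the units out one after another and record each base address."""
    bases, next_free = {}, 0
    for unit, size in sizes.items():
        bases[unit] = next_free
        next_free += size
    return bases

def resolve(bases, unit, offset):
    """Turn a (unit name, offset) pair into an absolute address."""
    return bases[unit] + offset

bases = link(unit_sizes)
print(resolve(bases, "SUB1", 8))  # prints 48
```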

Global info shared via common statement:

COMMON/NAME1/A,B,S(25)

Statement must occur in all units wishing to share information. Name of the block must be identical, though can give different names to variables. (Gives rise to holes in typing) Identifiers are matched in order w/ no checking of types across unit boundaries.

Space for all common blocks allocated and available globally.
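
The typing hole can be imitated in Python with a shared byte buffer. This is a sketch under explicit assumptions: the buffer stands in for a COMMON block of two 4-byte cells, unit_one and unit_two are hypothetical units, and little-endian IEEE single-precision layout is chosen for the example.

```python
import struct

common_name1 = bytearray(8)  # storage shared by every unit naming the block

def unit_one():
    # This unit declares the block's contents as two REALs.
    struct.pack_into("<ff", common_name1, 0, 1.0, 2.5)

def unit_two():
    # This unit declares the same cells as two INTEGERs. Cells are matched
    # by position only, so the bit patterns are reinterpreted unchecked.
    return struct.unpack_from("<ii", common_name1, 0)

unit_one()
print(unit_two())  # the integers are the raw bit patterns, not 1 and 2
```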

Procedure call and return straightforward
