Williams College CS334 - Programming Languages

CS 334
Programming Languages
Spring 2000

Lecture 2

Commands vs. Expressions

Characteristics of commands and imperative languages in general:

Support for variables - represent memory locations for storing updatable values.
Assignment operation - progress in computation depends on changes in values stored in variables.
Repetition - flow of control guided by conditional and looping statements, which control the order in which assignment statements are executed.

Imperative languages are organized around notion of statements.

Meaning of a statement is operation which, based on current contents of memory, and explicit values supplied to it, modifies the current contents of memory.

How are results of one command communicated to the next? Via changes to values in memory.

Problems

Too low level and architecture dependent.

Characteristics of expressions

Expressions return a value, depending on the state of the computation

Examples:

Literals: 3, true, "hello", 42.56
Aggregates: arrays, records, sets, lists, etc. E.g. {1,3,5}
Function calls: F(a,b), a + b * (c - d), (if x > 0 then sin else cos)(_)
Conditional expressions: if x <> 0 then a/x else 1, case (only in functional languages)
Named constants and variables: pi, x

Expressions (at least in math) better behaved than commands.

Meaning of a (pure) expression is operation which, based on current contents of memory, and explicit values supplied to it, returns a value.

Referential transparency

System is referentially transparent if, in a fixed context, the meaning of the whole can be determined solely by the meaning of its parts.

Independent of the surrounding expression.

Therefore once have evaluated an expression in a particular context, never have to evaluate it again in that context since value won't change.

Math. expressions are referentially transparent.

Ex. To evaluate "(2ax + b)(2ax + c)" in a context in which a = 3, b = 4, c = 7, and x = 2, it is sufficient to evaluate "2ax" only once.
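The example can be sketched in ML (introduced later in these notes). This is only an illustration: the name t for the shared subexpression is our own choice, and "let" is explained under local declarations.

```sml
(* Context from the example above *)
val a = 3;
val b = 4;
val c = 7;
val x = 2;

(* Direct form: the subexpression 2*a*x is written (and evaluated) twice *)
val direct = (2 * a * x + b) * (2 * a * x + c);

(* Shared form: evaluate 2*a*x once and reuse its value *)
val shared =
  let
    val t = 2 * a * x
  in
    (t + b) * (t + c)
  end;
```

Both forms give the same value because nothing can change a or x between the two uses.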

Can determine meaning of f(g(x)) by only knowing the value of f, g, and x (independently).

Moreover if meaning of g' is same as g, then f(g(x)) = f(g'(x)).

(Note importance of replacing construct by equivalent one in compiler optimizations)

Lose referential transparency if allow functions with side effects.

E.g., suppose a call to f(x) also increments x by 1 as a side effect.

Then in general f(x) + f(x) != 2 * f(x), since the second call sees a different x than the first.
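A minimal sketch of this effect using ML's imperative features (reference cells). It is a variation on the example above, not the same function: here f bumps a separate counter rather than its own argument, and the names counter and f are illustrative.

```sml
(* A counter held in a reference cell *)
val counter = ref 0;

(* Impure function: bumps the counter, then returns n plus the new count *)
fun f n = (counter := !counter + 1; n + !counter);

val sum = f 10 + f 10;        (* first call gives 11, second gives 12 *)

val () = counter := 0;        (* reset so both computations start from the same state *)
val doubled = 2 * f 10;       (* the single call gives 11 *)
```

Here sum is 23 while doubled is 22, so replacing f 10 + f 10 by 2 * f 10 would change the program.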

Program supporting referential transparency much easier to prove correct since only need be concerned about meaning of components and then put them together.

With imperative languages, lose referential transparency.

x := x + y; y := 2 * x; versus

y := 2 * x; x := x + y;

Since each command changes underlying state of computation and evaluation depends on state, ordering is critical.
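The two orderings can be simulated with ML reference cells. This is a sketch only; orderA and orderB are hypothetical names, and each function runs the two assignments from the same starting values.

```sml
(* Run "x := x + y; y := 2 * x" starting from the given initial values *)
fun orderA (x0, y0) =
  let
    val x = ref x0
    val y = ref y0
  in
    x := !x + !y;
    y := 2 * !x;
    (!x, !y)
  end;

(* The same two assignments in the opposite order *)
fun orderB (x0, y0) =
  let
    val x = ref x0
    val y = ref y0
  in
    y := 2 * !x;
    x := !x + !y;
    (!x, !y)
  end;
```

Starting from x = 1 and y = 2, orderA ends with (3, 6) and orderB with (3, 2).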

Also correctness of program depends on contents of all memory cells.

Even when try to isolate portions of computations into procedures, can have non-local effects because of use of non-local variables and reference parameters.

Issues with expressions

Order of evaluation
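e.g. short-circuit evaluation of boolean expressions.
If i > 0 and A[i] <> 99 then ....
What happens if A : ARRAY [1..100] OF INTEGER and i = 0 ?
Pascal vs. Modula-2 conventions: Modula-2 stops as soon as the left operand of "and" is false, while standard Pascal may evaluate both operands (and so may index out of bounds).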
Side-effects - destroy referential transparency.
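The order-of-evaluation issue can be sketched in ML, where andalso is short-circuit but ordinary function arguments are always evaluated first. A vector stands in for the array A (indices start at 0 here), and the names okShort, bothAnd, and strictResult are our own.

```sml
(* A small vector standing in for the array A *)
val a = Vector.fromList [10, 99, 30];

(* andalso evaluates its right operand only if the left one is true,
   so an out-of-range index never reaches Vector.sub *)
fun okShort i =
  i >= 0 andalso i < Vector.length a andalso Vector.sub (a, i) <> 99;

(* An ordinary function: both arguments are evaluated before the call *)
fun bothAnd (p, q) = p andalso q;

(* With index ~1 the second argument raises Subscript before bothAnd runs *)
val strictResult =
  (if bothAnd (~1 >= 0, Vector.sub (a, ~1) <> 99) then "true" else "false")
  handle Subscript => "Subscript raised";
```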

Some languages conflate (identify) expressions and commands (ALGOL 68 and C).

Often artificial and results in loss of advantages of expressions (e.g., referential transparency).

Ex: x = (y = x+1) + y + (x++)

Compare 2*(x++) and (x++) + (x++)

We will restrict our attention (for the most part) to functional languages with pure expressions.

Try to eliminate problems of commands and take advantage of referential transparency.

Promote reasoning about programs & implementation on parallel computers.

Idea - Program is simply application of a function to data.

No notion of memory or assignment - like a mathematical function - No side effects.

Very rich expressions - virtually all expressions first-class (unlike most imperative languages) in particular, functions are first class objects.

History of functional languages: LISP, Scheme, FP, ML, Haskell, Miranda, Id

Gödel's general recursive functions (developed further by Kleene) (§10.6) and Church and Kleene's lambda calculus (§10.7) used as foundations for computable functions (before Turing machines). All found to be equivalent, leading to Church's thesis.

John McCarthy (then at MIT) in 1958-60 introduced a functional language (LISP), originally in study of symbolic differentiation with linked lists. Key article published in 1960 showing how important programs could be expressed as pure functions operating on lists. (LISP has since been revised into competing dialects - Common LISP and Scheme.)

Functional languages or notation used in describing denotational semantics of programming languages starting in 1960's.

Most stunning event was Backus' Turing award lecture in 1978.
Proposed language FP (since replaced by FL) supporting "functional" style of programming.

First ML compiler was put out in 1977 (originally in support of interactive theorem proving system - text Edinburgh LCF by Gordon, Milner, and Wadsworth published). (Milner won the Turing award in 1991.) Standardized in about 1986.

Other important languages include SASL, KRC, and Miranda (all by David Turner). Haskell is successor. All support lazy evaluation.

Currently 3 main schools of functional languages:

LISP/Scheme
Strict functional (eager evaluation) (ML, Hope)
Lazy languages (Miranda, Haskell)

First two classes of languages support imperative features (though much more controlled in ML).

First uses dynamic typing, other two support static typing w/ polymorphic functions and type inference.

We choose ML for somewhat arbitrary reasons. Heavily used to develop real software, supports modern programming constructs.

The point of this part of the course is NOT to teach you ML, it is to teach familiarity with thinking in the functional paradigm with ML as the example language (though talk about others as well). I expect you to mainly learn ML on your own in the lab while I lecture on related material.

ML

Overview of ML

Developed in Edinburgh in late 1970's as Meta-Language for automated theorem proving system.

Success led to adoption and strengthening as programming language.

Important attributes:

Primarily applicative
Functions are first class values
Statically scoped
Static typing via type inference
Polymorphic types
Rich type system including support for ADT's.
Support for imperative features.
Support for exception handling
Automatic storage management via garbage collection
Incremental compiler supporting interactive program development.

How to use the run-time system.

To launch ML type:

sml

System responds with message saying in ML, and then "-" prompt.

Can load definitions from UNIX file by typing:

   use "myfile.sml";

where myfile.sml is the name of your file. It should be in the same directory you were in when you typed sml.

Terminate session by typing control-D.

Evaluate expression by typing in and following with ";", e.g.

   - 3 + 5;
   val it = 8 : int

In the previous line (and later examples), "-" is the prompt to the user, so the rest of the code on that line is what the user types in. The computer's response is shown directly below.

"it" refers to last value computed. Can also bind value to an identifier:

   - val six = 6;
   val six = 6 : int

Thus typing an expression, exp, is equivalent to typing: val it = exp;

Identifier often called a variable, but really a constant declaration ("val" for value).

Can also define functions.

   - fun succ x = x + 1;
   val succ = fn : int -> int
   - succ 12;
   val it = 13 : int
   - 17 * (succ 3);
   val it = 68 : int

Can also write:

   - val succ = fn x => x + 1;
   val succ = fn : int -> int

"fun" declaration tells compiler to look for fcn arguments.

Note semi-colon at top-level terminates parsing and causes evaluation.

No loops in the language, all functions written via recursion and if.. then.. else:

   - fun fact n = if n = 0 then 1 else n * fact (n-1);

Data types in ML

Built-in data types

unit, bool, int, real, strings, characters

unit has only one value: ()
bool includes true, false and operators: not, andalso, orelse
int includes positive and negative: ..., ~2, ~1, 0, 1, 2, ... Supports +, -, *, div, mod, =, <, <=, >, >=
real of form 3.17, 2.4E17, with +, -, *, /, <, <=, >, >=, and functions such as ln, exp, sin, arctan (in the SML97 library these are Math.ln, Math.exp, Math.sin, Math.atan). Real no longer supports "=" because of dangers of round-off error. Instead, test whether the absolute value of the difference of the numbers is less than some small tolerance.
string of form "my string" - \t = tab, \n = newline. Supports ^ (concatenation), size (length of string), substring where substring("hello",1,3) -> "ell"
char type is new in SML97. Write as # followed by string of length one. Thus #"b" is the character b, while "b" is the string of length one containing only the character b.
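A short sketch of the tolerance test for reals and of the string and char operations, assuming the SML97 library. The names nearlyEqual and eps and the tolerance 1.0E~9 are our own choices, not part of the language.

```sml
(* Tolerance-based comparison, used in place of "=" on reals *)
fun nearlyEqual (a : real, b : real, eps : real) = Real.abs (a - b) < eps;

val sum = 0.1 + 0.2;
val close = nearlyEqual (sum, 0.3, 1.0E~9);

(* Strings: concatenation, size, substring (start index, then length) *)
val greeting = "hello" ^ " " ^ "world";
val n = size greeting;                 (* 11 *)
val mid = substring ("hello", 1, 3);   (* "ell" *)

(* Characters: String.sub extracts the character at a 0-based position *)
val c = String.sub ("b", 0);           (* #"b" *)
```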

overloading

If expression involves an overloaded operator (e.g., + , *, -), and no other clues as to what type the argument or result should be, used to get type-checking error:

   - fun double x = x + x;
   Type checking error in: (syntactic context unknown)
   Unresolvable overloaded identifier: +
   Definition cannot be found for the type: ('a * 'a) -> 'a

In SML97, assumes the argument must be int:

   - fun double x = x+x;
   val double = fn : int -> int

type declarations

Must put in type annotations if want a function on a type other than int and there are no other clues for type inference.

Can include type info if like.

   - fun succ (x:real) = x + 1.0;

   - fun succ x : real = x + 1.0;

(which tells system that the result of the function is a real) or even

   - fun succ (x:real) :real  = x + 1.0;

though in these cases don't need to because clue of using "1.0" tells compiler that you want real addition!
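An annotation does matter when the body contains no such clue. A minimal sketch, assuming SML97 defaulting of overloaded operators; doubleReal is our own name.

```sml
(* With no other clue, + defaults to int *)
fun double x = x + x;                 (* int -> int *)

(* One annotation is enough to select real addition *)
fun doubleReal (x : real) = x + x;    (* real -> real *)

val four = double 2;
val five = doubleReal 2.5;
```

Applying double to 2.5 would be rejected by the type checker, since double was already fixed at int -> int.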

Type constructors

tuples, records, lists

tuples

(17,"abc", true) : int * string * bool

records

{name = "bob",salary = 50000.99, rank=1}: {name: string, salary:real, rank:int}

Tuples are abbreviations of records where labels are 1,2,3,...
Thus (17,"abc", true) = {1 = 17, 2 = "abc", 3 = true}

Selectors:

#lab : {lab : 'a,...} -> 'a

Thus #rank({name = "bob",salary = 50000.99, rank=1}) = 1
#2((17,"abc", true)) -> "abc"
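The selector examples, written as they might appear in a program file. The identifiers emp, triple, n, and k are illustrative.

```sml
val emp = {name = "bob", salary = 50000.99, rank = 1};
val r = #rank emp;                 (* 1 *)
val who = #name emp;               (* "bob" *)

val triple = (17, "abc", true);
val second = #2 triple;            (* "abc" *)

(* A tuple is the record whose labels are 1, 2, 3 *)
val same = (triple = {1 = 17, 2 = "abc", 3 = true});

(* Pattern matching is an alternative to selectors; "..." skips other fields *)
val {name = n, rank = k, ...} = emp;
```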

Ex. of function on tuples:

   - fun power (m,n) = if n = 0 then 1
                                else m * power (m,n-1);
   val power = fn : (int * int) -> int

On the other hand

   - fun cpower m n = if n = 0 then 1
                                else m * cpower m (n-1);
   val cpower = fn : int -> (int -> int)

Note these are different functions!

Latter said to be in "Curried" form (after Haskell Curry).

Can define

   - val twopower = cpower 2;
   val twopower = fn : int -> int
   - twopower 3;
   val it = 8 : int
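A sketch of how the two forms relate. The converters curry and uncurry are defined here for illustration; they are not assumed to be built in.

```sml
fun power (m, n) = if n = 0 then 1 else m * power (m, n - 1);
fun cpower m n = if n = 0 then 1 else m * cpower m (n - 1);

(* Partial application works only for the curried form *)
val powersOfTwo = map (cpower 2) [0, 1, 2, 3];

(* Convert between the tupled and curried forms *)
fun curry f x y = f (x, y);
fun uncurry f (x, y) = f x y;

val viaCurry = curry power 2 3;
val viaUncurry = uncurry cpower (2, 3);
```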

lists

[2,3,4,5,6] - all elts must be of same type.

Operations:

length
@ - append - e.g. [1,2,3]@[4,5,6] = [1,2,3,4,5,6]
:: - prefix (e.g. if x = [2,3,4,5,6] then 1::x = [1,2,3,4,5,6])
map - apply function to all elements of a list,
e.g. map sqr [1,2,4] = [1,4,16]
rev - reverses list
[], nil - empty list

Many kinds of lists:

int list: [1,2,3]
string list: ["ab","cd","ef"]

nil is part of any list type,

   - nil;
   val it = [] : 'a list

where 'a stands for a type variable. Similarly write:

   - map;
   val it = fn: ('a -> 'b) -> (('a list) -> ('b list))

Map is first example of a polymorphic function.
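A sketch of map used at several types. The functions sqr and myMap are defined here for illustration (myMap is our own version of map, written by cases as explained below in these notes).

```sml
fun sqr x = x * x;                          (* int -> int *)

val squares = map sqr [1, 2, 4];            (* 'a = int, 'b = int *)
val lengths = map size ["ab", "cde"];       (* 'a = string, 'b = int *)
val flags = map (fn n => n > 2) [1, 2, 3];  (* 'a = int, 'b = bool *)

(* Nothing in the definition depends on the element types,
   so the same code serves every choice of 'a and 'b *)
fun myMap f [] = []
  | myMap f (x :: rest) = f x :: myMap f rest;

val same = (myMap sqr [1, 2, 4] = squares);
```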

Lists are built up using ::, and can be decomposed the same way, i.e.,

[1,2,3] = 1::[2,3] = 1::2::[3] = 1::2::3::nil

Can define functions by cases.

   - fun product [] : int = 1
   =   | product (fst::rest) = fst * (product rest);

Note that "=" is automatically printed on continuation line. Don't include it in your program files!

Can also use integers in patterns:

   - fun oneTo 0 = []
   =   | oneTo n = n::(oneTo (n-1));

   - fun fact n = product (oneTo n);

Note oneTo 5 = [5,4,3,2,1]

Could have written

   val fact = product o oneTo (* o is fcn. comp. *)

Here is how we could define a reverse fcn if it were not provided:

   - fun reverse [] = []   
   =   | reverse (h::t) = reverse(t)@[h];  (* pattern matching *)

Pattern matching

Pattern matching is quite important in this language.

Rarely use hd or tl - list operators giving head and tail of list.

Note that hd (a::x) = a, tl (a::x) = x, and ((hd x) :: (tl x)) = x if x is a list with at least one element.

Can use pattern matching in relatively complex ways to bind variables:

   - val (x,y) = (5 div 2, 5 mod 2);   
   val x = 2 : int   
   val y = 1 : int   
   
   - val head::tail = [1,2,3];   
   val head = 1 : int   
   val tail = [2,3] : int list   
   
   - val {a = x, b = y} = {b = 3, a = "one"};   
   val x = "one" : string   
   val y = 3 : int   
   
   - val head::_ = [4,5,6];  (* note use of wildcard "_" *)   
   val head = 4 : int
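The same kinds of patterns can appear in function definitions. A sketch with illustrative names (sumPairs, unzip, secondOf); the default value 0 in secondOf is an arbitrary choice.

```sml
(* Sum the components of every pair in a list *)
fun sumPairs [] = 0
  | sumPairs ((a, b) :: rest) = a + b + sumPairs rest;

(* Split a list of pairs into a pair of lists *)
fun unzip [] = ([], [])
  | unzip ((a, b) :: rest) =
      let
        val (xs, ys) = unzip rest
      in
        (a :: xs, b :: ys)
      end;

(* The wildcard ignores the parts that are not needed *)
fun secondOf (_ :: y :: _) = y
  | secondOf _ = 0;
```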

Type inference

Language is strongly typed via type inference - infers type involving type variables if possible.

Thus

   hd : ('a list) -> 'a   
   tl : ('a list) -> ('a list)

Define

   fun last [x] = x   
     | last (fst::snd::rest) = last (snd::rest);

has type 'a list -> 'a, but don't have to declare it!
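A few more definitions with no type annotations. The types in the comments are what we would expect the system to infer (the printed type variable names may differ); all function names are illustrative.

```sml
fun swap (x, y) = (y, x);              (* 'a * 'b -> 'b * 'a *)
fun compose f g x = f (g x);           (* ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b *)
fun firstOrDefault (d, []) = d
  | firstOrDefault (_, x :: _) = x;    (* 'a * 'a list -> 'a *)

val p = swap (1, "one");
val n = compose size (fn s => s ^ s) "ab";
val d = firstOrDefault (0, []);
val h = firstOrDefault (0, [7, 8]);
```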

Restrictions on type inference (including overloading problems)

As noted earlier, type inference does not always interact well with overloading: arith ops, ordering (e.g. "<") - though it's better in sml97 than it was!

Also need to distinguish "equality" types:

   - fun search item [] = false   
   =   | search item (fst::rest) = if item = fst then true   
   =                                      else search item rest;   
   val search = fn : ''a -> ((''a list) -> bool)

Two single quotes (not a double-quote character) before a type variable name indicate an "equality" type. Cannot use "=" on types which are real or function types or contain real or function types. Also, the only type variables allowed in equality types are those written with ''.
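One way around the restriction is to pass the comparison in as a parameter. A sketch, with searchBy and closeTo as hypothetical names and 1.0E~9 as an arbitrary tolerance.

```sml
(* Uses "=", so the element type must be an equality type *)
fun search item [] = false
  | search item (fst :: rest) = (item = fst) orelse search item rest;

(* Same search, but the caller supplies the equality test,
   so it works at any element type, including real *)
fun searchBy eq item [] = false
  | searchBy eq item (fst :: rest) =
      eq (item, fst) orelse searchBy eq item rest;

fun closeTo (a : real, b : real) = Real.abs (a - b) < 1.0E~9;

val foundInt = search 3 [1, 2, 3];
val foundStr = search "x" ["a", "b"];
val foundReal = searchBy closeTo 2.0 [1.0, 2.0];
```

Writing search 2.0 [1.0, 2.0] instead would be rejected, since real is not an equality type.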

Local declarations (including parallel and sequential declarations).

Functions and values declared at top level (interactively) stay visible until a new definition is given to the identifier.

   - val x = 3 * 3;   
   val x = 9 : int
   - 2 * x;   
   val it = 18 : int

Can also give local declarations of function and variables.

   - fun roots (a,b,c) = let val disc = Math.sqrt (b * b - 4.0 * a * c)
   =                     in   
   =                         ((~b + disc)/(2.0*a),(~b - disc)/(2.0*a))   
   =                     end;   
   - roots (1.0,5.0,6.0);   
   val it = (~2.0,~3.0) : real * real
   - disc;   
   Type checking error in: disc   
   Unbound value identifier: disc
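Local declarations can also introduce helper functions. A sketch, with isPrime and noDivisorFrom as illustrative names; the helper is visible only between "in" and "end" and may refer to the parameter n of the enclosing function.

```sml
fun isPrime n =
  let
    (* true if no d' >= d with d' * d' <= n divides n *)
    fun noDivisorFrom d =
      d * d > n orelse (n mod d <> 0 andalso noDivisorFrom (d + 1))
  in
    n >= 2 andalso noDivisorFrom 2
  end;
```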

Scoping

ML uses static scoping (unlike original LISP)

   - val x = 3;   
   val x = 3 : int   
   - fun f y = x + y;   
   val f = fn : int -> int   
   - val x = 6;   
   val x = 6 : int   
   - f 0;

What is answer?

3!!

Why? Because definition of f used first "x", not second.
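The same session written as a program file, with two extra bindings (fromF and fromTop, our names) that record which x each use refers to.

```sml
val x = 3;
fun f y = x + y;     (* this x is the binding above, fixed when f is defined *)
val x = 6;           (* a new binding; it shadows the old x but does not change it *)

val fromF = f 0;     (* uses the x = 3 seen by f *)
val fromTop = x + 0; (* uses the newest x = 6 *)
```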

ML employs "eager" or call-by-value parameter passing

Talk later about "lazy" or "call-by-need".

Declarations and Order of operations:

Function application: evaluate operand, substitute operand value in for formal parameter, and evaluate body.
Inside record, evaluate fields left to right.
Inside let expression of form "let decl in exp end": evaluate decl producing new environment, evaluate exp in new environment, restore old environment, return value of exp.
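These rules can be observed by logging evaluation with a reference cell. This is a sketch only; trace, note, and alwaysOne are hypothetical helpers introduced for the illustration.

```sml
(* A log of evaluation events, newest first *)
val trace : string list ref = ref [];

(* Record a label, then return the value unchanged *)
fun note label v = (trace := label :: !trace; v);

(* Call-by-value: the argument is evaluated even though the body ignores it *)
fun alwaysOne _ = 1;
val one = alwaysOne (note "argument" 0);

(* Record fields are evaluated in the order written, not in label order *)
val r = {b = note "b" 1, a = note "a" 2};

val events = rev (!trace);   (* oldest event first *)
```

The events come out as "argument", then "b", then "a".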

Can have sequential or parallel declarations:

   - val x = 12   
   = val y = x +2;   
   val x = 12 : int   
   val y = 14 : int   
   - val x = 2   
   = and y = x + 3;   
   val x = 2 : int   
   val y = 15 : int
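The same contrast inside let expressions, as a sketch. In a parallel declaration every right-hand side is evaluated before any of the names is bound; the result names (s1, s2, p1, p2, u, v) are our own.

```sml
val x = 12;

(* Sequential: the second declaration sees the x just declared *)
val (s1, s2) =
  let
    val x = 2
    val y = x + 3
  in
    (x, y)
  end;

(* Parallel: x + 3 uses the outer x = 12 *)
val (p1, p2) =
  let
    val x = 2
    and y = x + 3
  in
    (x, y)
  end;

(* A consequence: "and" can swap two values without a temporary *)
val (u, v) =
  let
    val a = 1
    val b = 2
    val a = b
    and b = a
  in
    (a, b)
  end;
```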

In the parallel declaration, x + 3 is evaluated using the earlier x = 12, which is why y = 15. When defining functions, however, simultaneous declaration supports mutual recursion.

   - fun f n = if n = 0 then 1 else g n   
   = and g m = m * f(m-1);
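The pair f and g above computes factorial. A second sketch of mutual recursion, with isEven and isOdd as illustrative names; both assume a non-negative argument.

```sml
fun f n = if n = 0 then 1 else g n
and g m = m * f (m - 1);

(* Each function is defined in terms of the other *)
fun isEven 0 = true
  | isEven n = isOdd (n - 1)
and isOdd 0 = false
  | isOdd n = isEven (n - 1);

val fiveFactorial = f 5;
val tenIsEven = isEven 10;
val sevenIsOdd = isOdd 7;
```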

Kim Bruce (kim@cs.williams.edu)