Assignment 3 Due 3/11/97

- Suppose you are given the following grammar for simple English
sentences:
<sentence> -> <noun-phrase> <verb-phrase> '.'

<verb-phrase> -> <verb> <noun-phrase>

<verb> -> sees | likes | grabs

<noun-phrase> -> <article> <noun>

<article> -> a | the

<noun> -> girl | dog

a. How would you modify the grammar to introduce a new start symbol <paragraph> that will allow you to generate any number of sentences?

b. Suppose you wish to be able to generate sentences with all possible pronouns as subjects and objects (e.g., you, he, she, it, they). What problems do you encounter in generating grammatically correct English? How can you solve these problems? Provide a modification of the grammar above which does reasonably well at supporting pronouns.

- Two famous phrases used as examples in linguistics are "Time flies like an
arrow" and "Fruit flies like a banana." (I believe they are due to Noam
Chomsky). Please generate plausible syntax rules for English that would allow
you to parse both of these sentences. The point of the examples is to indicate
the difficulty in parsing (and understanding) natural languages. Please
explain this difficulty.
- Please do problem 8 on page 93 of Louden. (You might want to look at
the solution to the similar problem 9 in the back of the text for a hint.)
- Suppose we have a language which uses dynamic scope. Ignoring pointers,
is it the case that the scope of a variable is the same as its lifetime?
Why or why not?
- Please do problem 9 on page 146 of Louden.
- In this problem, I want you to begin to design an ML interpreter for
a simple functional language. Our language is relatively simple, but more
sophisticated than the arithmetic expressions of last week since it involves
functions. The expressions are written in the
language given by the following simple BNF grammar.
e ::= x | n | true | false | succ | pred | iszero | if e then e else e | (fn x => e) | (e e) | rec x => e

In the above, "x" is a variable, "n" stands for an integer, "true" and "false" are the truth values, "succ" and "pred" are unary functions which add 1 to or subtract 1 from their argument, "iszero" is a unary function which returns "true" if its argument is 0 and "false" otherwise, "if...then...else..." is a conditional expression, "fn x => e" is a function with formal parameter "x" and body "e", and "(e e)" represents function application. (Don't worry about "rec x => e" for now! It is used for defining recursive functions.)

As in last week's assignment, we will presume that we have a parser which parses input into an *abstract syntax tree*, which your interpreter should use. The definition of the ML datatype is

    datatype term = AST_ID of string | AST_NUM of int | AST_BOOL of bool
                  | AST_SUCC | AST_PRED | AST_ISZERO
                  | AST_IF of (term * term * term) | AST_ERROR
                  | AST_FUN of (string * term) | AST_APP of (term * term)
                  | AST_REC of (string * term)
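To make the constructors concrete, the following sketch builds a few terms by hand. The value names (`succSeven`, `condExample`, `succFun`, `appExample`) are made up for illustration and are not part of the assignment; the datatype is repeated only so that the fragment loads on its own.

```sml
datatype term = AST_ID of string | AST_NUM of int | AST_BOOL of bool
              | AST_SUCC | AST_PRED | AST_ISZERO
              | AST_IF of (term * term * term) | AST_ERROR
              | AST_FUN of (string * term) | AST_APP of (term * term)
              | AST_REC of (string * term)

(* (succ 7): an application of the initial function succ to a number *)
val succSeven = AST_APP (AST_SUCC, AST_NUM 7)

(* (if (iszero 0) then 1 else 2): guard, then-branch, else-branch *)
val condExample =
  AST_IF (AST_APP (AST_ISZERO, AST_NUM 0), AST_NUM 1, AST_NUM 2)

(* (fn x => (succ x)): formal parameter as a string, body as a term *)
val succFun = AST_FUN ("x", AST_APP (AST_SUCC, AST_ID "x"))

(* ((fn x => (succ x)) (succ 0)): a function applied to an argument *)
val appExample = AST_APP (succFun, AST_APP (AST_SUCC, AST_NUM 0))
```

Because every component of `term` is an equality type, two terms can be compared with `=`, which is convenient when checking an interpreter's output against an expected tree.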

As before, this definition mirrors the BNF grammar given above; for instance, the constructor `AST_ID` makes a string into an identifier or variable, and the constructor `AST_FUN` makes a string representing the formal parameter and a term representing the body of the function into a function. Interpreting abstract syntax trees is much easier than trying to interpret terms directly.

You are to write an ML function `interp` that takes an abstract syntax tree representing a term and returns the result of evaluating it, which will also be an abstract syntax tree. The reduction should be done according to the rules given below. The expression "e => v" means that the term "e" evaluates to "v" (and then can be evaluated no further). The rules below are written for the expressions in the original grammar. *Your program should be written for the equivalent expressions using the abstract syntax trees (elements of type "term").*

The base cases are:

(1) `n => n` for `n` an integer.

(2) `true => true`, and similarly for `false`.

(3) `error => error`

(4) `succ => succ`, and similarly for the other initial functions.

The other cases are slightly more complicated. They are written in the form of a rule in the manner of the following example:

            b => true    e1 => v
    (5)  ---------------------------
          if b then e1 else e2 => v

We read the rule from the bottom up: if the expression is an if-then-else with components b, e1, and e2, and b evaluates to true and e1 returns v, then the entire expression returns v. Of course, we also have the symmetric rule

            b => false    e2 => v
    (6)  ----------------------------
          if b then e1 else e2 => v
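As an illustration of how a rule of this shape can be turned into ML, here is a minimal sketch covering only rules (1) through (6). The function name `evalCond` is hypothetical, applications are left as a placeholder that returns `AST_ERROR`, and returning `AST_ERROR` when the guard is not a boolean is an assumption, since the rules do not say what happens in that case.

```sml
datatype term = AST_ID of string | AST_NUM of int | AST_BOOL of bool
              | AST_SUCC | AST_PRED | AST_ISZERO
              | AST_IF of (term * term * term) | AST_ERROR
              | AST_FUN of (string * term) | AST_APP of (term * term)
              | AST_REC of (string * term)

(* evalCond: hypothetical helper handling rules (1)-(6) only *)
fun evalCond (AST_IF (b, e1, e2)) =
      (case evalCond b of                    (* premise: evaluate the guard *)
           AST_BOOL true  => evalCond e1     (* rule (5) *)
         | AST_BOOL false => evalCond e2     (* rule (6) *)
         | _              => AST_ERROR)      (* assumption: non-boolean guard *)
  | evalCond (AST_APP _) = AST_ERROR         (* placeholder: not sketched here *)
  | evalCond (AST_ID _)  = AST_ERROR
  | evalCond (AST_FUN _) = AST_ERROR
  | evalCond (AST_REC _) = AST_ERROR
  | evalCond t = t                           (* rules (1)-(4): already values *)

(* if true then 1 else 2 *)
val sample = AST_IF (AST_BOOL true, AST_NUM 1, AST_NUM 2)
val result = evalCond sample                 (* AST_NUM 1, by rule (5) *)
```

Each premise above the line becomes a recursive call, and the conclusion below the line becomes the pattern on the left-hand side of the clause.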

The following are some of the cases for applications:

           e1 => succ    e2 => n
    (7)  ----------------------------
              (e1 e2) => (n+1)

          e1 => pred    e2 => 0           e1 => pred    e2 => (n+1)
    (8)  ---------------------------     --------------------------
               (e1 e2) => 0                    (e1 e2) => n

          e1 => iszero    e2 => 0         e1 => iszero    e2 => (n+1)
    (9)  ------------------------        ---------------------------
             (e1 e2) => true                  (e1 e2) => false

Here is a simple example using these rules: Evaluate `(if (iszero 0) then 1 else 2)`. According to rules (5) and (6), we must first evaluate `(iszero 0)`. By rule (9), this evaluates to `true`. Now by rule (5) (and the fact that `1 => 1` via rule (1)), the whole expression evaluates to `1`.

a. Use these rules to write an interpreter, interp: term -> term, for the subset of the language which does not include terms of the form AST_ID, AST_FUN, or AST_REC. If your interpreter tries to evaluate any of these three kinds of expression, it should return the error, AST_ERROR.

__Note__: In my directory, `~kim/cs334stuff/ML.interps`, you will find a file, `parser.sml`, which is an ML program which parses __files__ containing an expression from the simple BNF grammar given above into an expression using the AST terms. (I.e., if a file "foo" contains `succ 7`, `parsefile foo` returns `AST_APP(AST_SUCC, AST_NUM 7)`, an expression in the proper form for use by your interpreter.) Feel free to use this method to generate abstract syntax trees, which is much easier than typing in the long AST terms directly. You will find in the same directory the skeleton of a program called "`PCF.interp.student.sml`", which also contains brief explanations and examples.

b. The notation `e[x := v]` indicates the textual substitution of v for all free occurrences of x in e. For example, `(succ x) [x:=1]` is the expression `(succ 1)`. Please write an ML function `subst` that takes a term, `t`, a string representing a variable, `v`, and a term, `s`, and returns `t` with all free occurrences of `v` (actually `AST_ID v`) replaced by `s`. Thus, the function application (corresponding to `(succ x) [x:=1]`, above),

    subst (AST_APP(AST_SUCC, AST_ID "x")) "x" (AST_NUM 1)

gives the answer `AST_APP(AST_SUCC, AST_NUM 1)`.

Do **not** substitute in for bound occurrences of variables. I.e., substituting 3 for x in `(x + ((fn x => 2+x) 8))` should result in `(3 + ((fn x => 2+x) 8))`. The formal parameter x and its occurrences in the body of the function are not affected by the substitution because of the static scoping rules. (*Hint*: use pattern-matching on each constructor of the abstract syntax tree, calling `subst` recursively when you need to.)

c. Using your substitution function, extend your interp function from part a to include `AST_FUN` terms. The reduction for the terms involving `AST_FUN` should be done according to the rules given below.

Functions by themselves don't do anything (just like `succ` and `pred` above):

    (10)  (fn x => e) => (fn x => e)

Computations occur when you apply these functions to arguments. The next rule defines call-by-value function application, as in ML. If the function is of the form `fn x => e`, evaluate the operand to a value, `v1`, substitute `v1` in for the formal parameter in `e`, and then evaluate the modified body:

           e1 => (fn x => e3)    e2 => v1    e3[x:=v1] => v
    (11)  --------------------------------------------------------
                             (e1 e2) => v

For instance, in evaluating the application

    ((fn x => (succ x)) (succ 0))

we first note that the function is already fully evaluated, so we evaluate `(succ 0)` to `1`, and then plug this in for `x` in the body, `(succ x)`, of the function, obtaining `(succ 1)`, which evaluates to 2.

Notice that while terms of the form `(AST_ID s)` can appear whenever `s` is a formal parameter, we never need to evaluate terms of the form `(AST_ID s)`, because they are always replaced by the `subst` function before we evaluate the body of the function.

We have not yet provided a reduction rule for `AST_REC` terms. We will do that for the next homework. For now, just return `AST_ERROR` if your interpreter is applied to terms of the form `AST_ID s` or `AST_REC(s,e)`.
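For checking parts b and c, it may help to have the worked examples written out as terms. The sketch below only declares data; the value names are invented for illustration, and the shadowing case is a variant of the bound-variable example that stays within the grammar (it uses `succ` rather than `+`).

```sml
datatype term = AST_ID of string | AST_NUM of int | AST_BOOL of bool
              | AST_SUCC | AST_PRED | AST_ISZERO
              | AST_IF of (term * term * term) | AST_ERROR
              | AST_FUN of (string * term) | AST_APP of (term * term)
              | AST_REC of (string * term)

(* ((fn x => (succ x)) (succ 0)), the application from the rule (11) example *)
val application =
  AST_APP (AST_FUN ("x", AST_APP (AST_SUCC, AST_ID "x")),
           AST_APP (AST_SUCC, AST_NUM 0))

(* (succ x)[x := 1], the body after the operand has been evaluated to 1 *)
val substitutedBody = AST_APP (AST_SUCC, AST_NUM 1)

(* the final value of the whole application *)
val expectedValue = AST_NUM 2

(* Shadowing case: ((fn x => (succ x)) x) [x := 3].
   Only the free x (the argument) should be replaced; the x bound by fn stays. *)
val shadowInput =
  AST_APP (AST_FUN ("x", AST_APP (AST_SUCC, AST_ID "x")), AST_ID "x")
val shadowExpected =
  AST_APP (AST_FUN ("x", AST_APP (AST_SUCC, AST_ID "x")), AST_NUM 3)
```

Once `subst` and `interp` are written, `subst shadowInput "x" (AST_NUM 3) = shadowExpected` and `interp application = expectedValue` should both evaluate to `true` under the rules above.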