Lecture 10 — 2017-12-20

Hands on Applicative

``````import Data.Char
import Control.Applicative``````

We wrote up a straightforward instance for `Maybe` and a more interesting instance for `Either e`:

``````instance Applicative (Either e) where
pure x = Right x -- because Left x would be ill typed!
(Right f) <*> (Right v) = Right \$ f v
err@(Left e) <*> _ = err
_ <*> err@(Left e) = err``````

Then we went over the `Applicative` definitions for lists. There were two possibilities: cartesian product…

``````instance Applicative [] where
pure x = [x]

[]     <*>  _ = []
_      <*> [] = []
(f:fs) <*> xs = map f xs ++ fs <*> xs``````

…and zipping:

``````newtype ZipList a = ZipList { getZipList :: [a] }
deriving (Eq, Show, Functor)

instance Applicative ZipList where
pure = ZipList . repeat
ZipList fs <*> ZipList xs = ZipList (zipWith (\$) fs xs)``````

Obey the laws

Like `Functor`, the `Applicative` type class is governed by laws.

Identity: `pure id <*> v = v` Composition: `pure (.) <*> u <*> v <*> w = u <*> (v <*> w)` Homomorphism: `pure f <*> pure x = pure (f x)` Interchange: `u <*> pure y = pure (\$ y) <*> u`

Note that `identity` is a generalization of `id <\$> v = v` from `Functor`, since `f <\$> x = pure f <*> x`.

Parsers

The parser we defined for arithmetic in lecture 9 was fine, but it had a problem: it was parsing right associatively, where `x + y + z` was interpreted as `x + (y + z)`. It’s not what we want—especially if we’re going to introduce an operator like subtraction, which isn’t commutative!

The solution is something called `chainl1`. First, let’s recapitulate our parsers.

``````newtype Parser a = Parser { parse :: String -> Maybe (a,String) }

instance Functor Parser where
fmap f p = Parser \$ \s ->
case parse p s of
Nothing -> Nothing
Just (v,s') -> Just (f v,s')

instance Applicative Parser where
pure a = Parser \$ \s -> Just (a,s)
f <*> a = Parser \$ \s ->  -- f :: Parser (a -> b), a :: Parser a
case parse f s of
Nothing -> Nothing
Just (g,s') -> parse (fmap g a) s' -- g :: a -> b, fmap g a :: Parser b

instance Alternative Parser where
empty = Parser \$ \s -> Nothing
p1 <|> p2 = Parser \$ \s ->
case parse p1 s of
Just (a,s') -> Just (a,s')
Nothing -> parse p2 s

ensure :: (a -> Bool) -> Parser a -> Parser a
ensure pred p = Parser \$ \s ->
case parse p s of
Nothing -> Nothing
Just (a,s') -> if pred a then Just (a,s') else Nothing

satisfy :: (Char -> Bool) -> Parser Char
satisfy p = Parser f
where f [] = Nothing
f (x:xs) = if p x then Just (x,xs) else Nothing

where f [] = Just (Nothing,[])
f (c:s) = Just (Just c,c:s)

int :: Parser Int
int = read <\$> some (satisfy isDigit)

spaces :: Parser ()
spaces = many (satisfy isSpace) *> pure ()

char :: Char -> Parser Char
char c = spaces *> satisfy (==c)

str :: String -> Parser [Char]
str s = spaces *> string s
where string []     = pure []
string (c:s') = (:) <\$> satisfy (==c) <*> string s'

num :: Parser Int
num = spaces *> int``````

As a warmup, we can define a parser that gives us `p`s separated by some number of `sep`s, i.e., the langauge `p (sep p)*`.

``````sepBy1 :: Parser a -> Parser sep -> Parser [a]
sepBy1 p sep = (:) <\$> p <*> (many (sep *> p))``````

For example, `parse (int`sepBy1`char ',') "1,2,3,4"` yields `Just ([1,2,3,4],"")`.

The `sepBy1` parser doesn’t look at what `sep` produces, but for arithmetic, the separator will be the operator we care about. To define `chainl1`, the parser that gives left chaining, we need our separator to not return some arbitrary, thrown away value—we want it to tell us which operator to use! Once we’ve collected all the operators, we can apply them left-associatively by using a fold:

``````chainl1 :: Parser a -> Parser (a -> a -> a) -> Parser a
chainl1 p sep = foldl (\acc (op,v) -> op acc v) <\$>
p <*> many ((\op v -> (op,v)) <\$> sep <*> p)``````

Refactored, we can verify that our grammar still works… try it out!

``````data Arith =
Num Int
| Plus Arith Arith
| Times Arith Arith
| Neg Arith
deriving Show

sub :: Arith -> Arith -> Arith
sub e1 e2 = Plus e1 (Neg e2)

term, factor, neg, atom :: Parser Arith
where addop =     (char '+' *> pure Plus)
<|> (char '-' *> pure sub)
factor = neg `chainl1` mulop
where mulop =     (char '*' *> pure Times)
neg = Neg <\$> (char '-'  *> atom) <|> atom
atom = Num <\$> num <|> (char '(' *> term <* char ')')``````

Parsing WhileNZ

Once we have a parser for terms, parsing WhileNZ is a matter of being careful about keywords and identifiers. Here’s a working parser for the basic, Pascal-like syntax.

``````type VarName = String

keywords :: [String]
keywords = ["WHILE", "DO", "END", "SKIP"]

kw :: String -> Parser ()
kw s = pure () <* spaces <* (sequenceA \$ map (satisfy . (==)) s) -- subtly incorrect on, e.g., DOINK, which isn't a kw

var :: Parser VarName
var = ensure (not . (`elem` keywords)) (spaces *> id)
where id = (:) <\$> satisfy isAlpha <*> many (satisfy isAlphaNum)

data WhileNZ =
Skip
| Assign VarName Arith
| Seq WhileNZ WhileNZ
| WhileNZ Arith WhileNZ
deriving Show

statement, statements :: Parser WhileNZ
statements = foldl1 Seq <\$> (statement `sepBy1` char ';')
statement =     const Skip <\$> kw "SKIP"
<|> Assign <\$> (var <* str ":=") <*> term
<|> WhileNZ <\$>
(kw "WHILE" *> (term <* kw "DO")) <*>
(statements <* kw "END")
<|> (\c b -> Seq c (WhileNZ b c)) <\$>
(kw "DO" *> (statements <* kw "WHILE")) <*>
(term <* kw "END")

prog1 = fst <\$> parse statements "x := 0"
prog2 = fst <\$> parse statements "x := 0;\nWHILE x - 5 DO x := x + 1 END; SKIP"``````

Digression on `chainl1`

Okay, but how did `chainl1` work? Let’s go over it:

``````chainl1 :: Parser a -> Parser (a -> a -> a) -> Parser a
chainl1 p sep =
foldl (\acc (op,v) -> op acc v) <\$> p <*> many ((\op v -> (op,v)) <\$> sep <*> p)``````

First, let’s refactor a bit—anonymous functions are great to write when you’re on top of things, but named functions are easier to talk about when you’re not sure what’s going on. (There’s a name for this phenomenon: the Kingdom of Nouns.)

``````chainl1 :: Parser a -> Parser (a -> a -> a) -> Parser a
chainl1 p sep =
foldl applyOp <\$> p <*> many (collectOp <\$> sep <*> p)
where applyOp acc (op,v) = op acc v
collectOp op v = (op,v)``````

What’s this code do? If we put on our `Applicative` parsing glasses, we’re going to parse `p` then then zero or more `sep`s followed by a `p`, i.e., something of the form `p (sep p)*`. In terms of what’s parsed, this code is just like `sepBy1`, which is hopefully easier to understand:

``````sepBy1 :: Parser a -> Parser sep -> Parser [a]
sepBy1 p sep = (:) <\$> p <*> (many (sep *> p))``````

If we turn our attention to types (put on your typechecking hat; who knew it was going to be a costume party), we can see that parsing with `p` produces a value of type `a`, but parsing with `sep` produces a function of type `a -> a -> a`. That is, our separator produces an operator. Our function `collectOp` makes it so that when we’re parsing zero or more `sep p`s, we collect a pair of things: the operation and the right-hand value. So `many (collectOp <\$> sep <*> p)` will parse a list of pairs of operations and values.

After parsing with `p <*> many (collectOp <\$> sep <*> p)`, we find ourselves holding two things: a value of type `a`, from parsing with `p`, and a list of type `[(a->a->a,a)]` from the `many ...`. The first thing, the result from `p`, is our leftmost thing we’ve parsed. The list is a left-to-right listing of (i) the operation between the thing to our left and (ii) another value of type a parsed by `p`. We then flatten everything out by folding—using `foldl`, because we’re trying to be left associative.

That is, we have…

``p0 [(op1,p1),(op2,p2),(op3,p3),...,(opn,pn)]``

and we produce…

``(opn (... (op3 (op2 (op1 p0 p1) p2) p3) ...) pn)``

…which looks like the right answer to me!