Lecture 10 (2017-12-20)
Hands on Applicative
> import Data.Char
> import Control.Applicative
We wrote up a straightforward instance for `Maybe` and a more
interesting instance for `Either e`:
```Haskell
instance Applicative (Either e) where
pure x = Right x -- because Left x would be ill typed!
(Right f) <*> (Right v) = Right $ f v
err@(Left e) <*> _ = err
_ <*> err@(Left e) = err
```
Then we went over the `Applicative` definitions for lists. There were two possibilities: cartesian product...
```Haskell
instance Applicative [] where
pure x = [x]
[] <*> _ = []
_ <*> [] = []
(f:fs) <*> xs = map f xs ++ fs <*> xs
```
...and zipping:
```
newtype ZipList a = ZipList { getZipList :: [a] }
deriving (Eq, Show, Functor)
instance Applicative ZipList where
pure = ZipList . repeat
ZipList fs <*> ZipList xs = ZipList (zipWith ($) fs xs)
```
### Obey the laws

Like `Functor`, the `Applicative` type class is governed by laws.
*Identity:* `pure id <*> v = v`
*Composition:* `pure (.) <*> u <*> v <*> w = u <*> (v <*> w)`
*Homomorphism:* `pure f <*> pure x = pure (f x)`
*Interchange:* `u <*> pure y = pure ($ y) <*> u`
Note that `identity` is a generalization of `id <$> v = v` from
`Functor`, since `f <$> x = pure f <*> x`.
### Parsers

The parser we defined for arithmetic in [lecture 9](Lec09.html) was
fine, but it had a problem: it was parsing right associatively, where
`x + y + z` was interpreted as `x + (y + z)`. It's not what we
want---especially if we're going to introduce an operator like
subtraction, which isn't commutative!
The solution is something called `chainl1`. First, let's recapitulate
our parsers.
> newtype Parser a = Parser { parse :: String -> Maybe (a,String) }
>
> instance Functor Parser where
> fmap f p = Parser $ \s ->
> case parse p s of
> Nothing -> Nothing
> Just (v,s') -> Just (f v,s')
>
> instance Applicative Parser where
> pure a = Parser $ \s -> Just (a,s)
> f <*> a = Parser $ \s -> -- f :: Parser (a -> b), a :: Parser a
> case parse f s of
> Nothing -> Nothing
> Just (g,s') -> parse (fmap g a) s' -- g :: a -> b, fmap g a :: Parser b
>
> instance Alternative Parser where
> empty = Parser $ \s -> Nothing
> p1 <|> p2 = Parser $ \s ->
> case parse p1 s of
> Just (a,s') -> Just (a,s')
> Nothing -> parse p2 s
>
> ensure :: (a -> Bool) -> Parser a -> Parser a
> ensure pred p = Parser $ \s ->
> case parse p s of
> Nothing -> Nothing
> Just (a,s') -> if pred a then Just (a,s') else Nothing
>
> satisfy :: (Char -> Bool) -> Parser Char
> satisfy p = Parser f
> where f [] = Nothing
> f (x:xs) = if p x then Just (x,xs) else Nothing
>
> lookahead :: Parser (Maybe Char)
> lookahead = Parser f
> where f [] = Just (Nothing,[])
> f (c:s) = Just (Just c,c:s)
>
> int :: Parser Int
> int = read <$> some (satisfy isDigit)
>
> spaces :: Parser ()
> spaces = many (satisfy isSpace) *> pure ()
>
> char :: Char -> Parser Char
> char c = spaces *> satisfy (==c)
>
> str :: String -> Parser [Char]
> str s = spaces *> string s
> where string [] = pure []
> string (c:s') = (:) <$> satisfy (==c) <*> string s'
>
> num :: Parser Int
> num = spaces *> int
As a warmup, we can define a parser that gives us `p`s separated by
some number of `sep`s, i.e., the langauge `p (sep p)*`.
> sepBy1 :: Parser a -> Parser sep -> Parser [a]
> sepBy1 p sep = (:) <$> p <*> (many (sep *> p))
For example, `parse (int `sepBy1` char ',') "1,2,3,4"` yields `Just
([1,2,3,4],"")`.
The `sepBy1` parser doesn't look at what `sep` produces, but for
arithmetic, the separator will be the operator we care about. To
define `chainl1`, the parser that gives left chaining, we need our
separator to not return some arbitrary, thrown away value---we want it
to tell us which operator to use! Once we've collected all the
operators, we can apply them left-associatively by using a fold:
> chainl1 :: Parser a -> Parser (a -> a -> a) -> Parser a
> chainl1 p sep = foldl (\acc (op,v) -> op acc v) <$>
> p <*> many ((\op v -> (op,v)) <$> sep <*> p)
Refactored, we can verify that our grammar still works... try it out!
> data Arith =
> Num Int
> | Plus Arith Arith
> | Times Arith Arith
> | Neg Arith
> deriving Show
>
> sub :: Arith -> Arith -> Arith
> sub e1 e2 = Plus e1 (Neg e2)
>
> term, factor, neg, atom :: Parser Arith
> term = factor `chainl1` addop
> where addop = (char '+' *> pure Plus)
> <|> (char '-' *> pure sub)
> factor = neg `chainl1` mulop
> where mulop = (char '*' *> pure Times)
> neg = Neg <$> (char '-' *> atom) <|> atom
> atom = Num <$> num <|> (char '(' *> term <* char ')')
### Parsing WhileNZ

Once we have a parser for terms, parsing WhileNZ is a matter of being
careful about keywords and identifiers. Here's a working parser for
the basic, Pascal-like syntax.
> type VarName = String
>
> keywords :: [String]
> keywords = ["WHILE", "DO", "END", "SKIP"]
>
> kw :: String -> Parser ()
> kw s = pure () <* spaces <* (sequenceA $ map (satisfy . (==)) s) -- subtly incorrect on, e.g., DOINK, which isn't a kw
>
> var :: Parser VarName
> var = ensure (not . (`elem` keywords)) (spaces *> id)
> where id = (:) <$> satisfy isAlpha <*> many (satisfy isAlphaNum)
>
> data WhileNZ =
> Skip
> | Assign VarName Arith
> | Seq WhileNZ WhileNZ
> | WhileNZ Arith WhileNZ
> deriving Show
>
> statement, statements :: Parser WhileNZ
> statements = foldl1 Seq <$> (statement `sepBy1` char ';')
> statement = const Skip <$> kw "SKIP"
> <|> Assign <$> (var <* str ":=") <*> term
> <|> WhileNZ <$>
> (kw "WHILE" *> (term <* kw "DO")) <*>
> (statements <* kw "END")
> <|> (\c b -> Seq c (WhileNZ b c)) <$>
> (kw "DO" *> (statements <* kw "WHILE")) <*>
> (term <* kw "END")
>
> prog1 = fst <$> parse statements "x := 0"
> prog2 = fst <$> parse statements "x := 0;\nWHILE x - 5 DO x := x + 1 END; SKIP"
### Digression on `chainl1`

Okay, but how did `chainl1` work? Let's go over it:
```
chainl1 :: Parser a -> Parser (a -> a -> a) -> Parser a
chainl1 p sep =
foldl (\acc (op,v) -> op acc v) <$> p <*> many ((\op v -> (op,v)) <$> sep <*> p)
```
First, let's refactor a bit---anonymous functions are great to write
when you're on top of things, but named functions are easier to talk
about when you're not sure what's going on. (There's a name for this
phenomenon: the [Kingdom of Nouns](https://steve-yegge.blogspot.com/2006/03/execution-in-kingdom-of-nouns.html).)
```
chainl1 :: Parser a -> Parser (a -> a -> a) -> Parser a
chainl1 p sep =
foldl applyOp <$> p <*> many (collectOp <$> sep <*> p)
where applyOp acc (op,v) = op acc v
collectOp op v = (op,v)
```
What's this code do? If we put on our `Applicative` parsing glasses, we're going to parse `p` then then zero or more `sep`s followed by a `p`, i.e., something of the form `p (sep p)*`. In terms of what's parsed, this code is just like `sepBy1`, which is hopefully easier to understand:
```
sepBy1 :: Parser a -> Parser sep -> Parser [a]
sepBy1 p sep = (:) <$> p <*> (many (sep *> p))
```
If we turn our attention to types (put on your typechecking hat; who
knew it was going to be a costume party), we can see that parsing with
`p` produces a value of type `a`, but parsing with `sep` produces a *function*
of type `a -> a -> a`. That is, our separator produces an operator. Our
function `collectOp` makes it so that when we're parsing zero or more
`sep p`s, we collect a pair of things: the operation and the right-hand
value. So `many (collectOp <$> sep <*> p)` will parse a list of pairs of
operations and values.
After parsing with `p <*> many (collectOp <$> sep <*> p)`, we find
ourselves holding two things: a value of type `a`, from parsing with `p`,
and a list of type `[(a->a->a,a)]` from the `many ...`. The first thing, the
result from `p`, is our leftmost thing we've parsed. The list is a
left-to-right listing of (i) the operation between the thing to our
left and (ii) another value of type a parsed by `p`. We then flatten
everything out by folding---using `foldl`, because we're trying to be
left associative.
That is, we have...
```
p0 [(op1,p1),(op2,p2),(op3,p3),...,(opn,pn)]
```
and we produce...
```
(opn (... (op3 (op2 (op1 p0 p1) p2) p3) ...) pn)
```
...which looks like the right answer to me!