# Lecture 9 — 2018-02-13

## Parsing with Applicative

This lecture is written in literate Haskell; you can download the raw source.

We wrote up a straightforward instance for `Maybe`

and a more interesting instance for `Either e`

:

```
instance Applicative (Either e) where
pure x = Right x -- because Left x would be ill typed!
(Right f) <*> (Right v) = Right $ f v
err@(Left e) <*> _ = err
_ <*> err@(Left e) = err
```

Then we went over the `Applicative`

definitions for lists. There were two possibilities: cartesian product…

```
instance Applicative [] where
pure x = [x]
[] <*> _ = []
_ <*> [] = []
(f:fs) <*> xs = map f xs ++ fs <*> xs
```

…and zipping:

```
newtype ZipList a = ZipList { getZipList :: [a] }
deriving (Eq, Show, Functor)
instance Applicative ZipList where
pure = ZipList . repeat
ZipList fs <*> ZipList xs = ZipList (zipWith ($) fs xs)
```

### Readers

We define an instance for readers, too… a sort of prelude to defining our parsers.

```
instance Applicative ((->) r) where
pure v r = v
frab <*> fra = \r -> frab r (fra r)
```

We were able to use this instance to construct functions quickly and easily, like:

```
eogth = (&&) <$> even <*> (>100)
aos = (||) <$> isAlpha <*> isSpace
```

The first function returns true on numbers that are even and greater than one hundred; the latter returns true for characters that are alphabetical or whitespace.

### Obey the laws

Like `Functor`

, the `Applicative`

type class is governed by laws.

*Identity:* `pure id <*> v = v`

*Composition:* `pure (.) <*> u <*> v <*> w = u <*> (v <*> w)`

*Homomorphism:* `pure f <*> pure x = pure (f x)`

*Interchange:* `u <*> pure y = pure ($ y) <*> u`

Note that `identity`

is a generalization of `id <$> v = v`

from `Functor, since`

f <$> x = pure f <*> x`.

We defined “classic style” parsers in terms of a lexer (`String -> [Token]`

) and a parser (`[Token] -> AST`

), but we spent most of class looking at an alternative model: Applicative parsing.

```
import Data.Char
import Control.Applicative
newtype Parser a = Parser { parse :: String -> Maybe (a,String) }
letter :: Char -> Parser Char
letter c = Parser $ \s ->
case s of
c':s' | c == c' -> Just (c,s')
_ -> Nothing
letters :: String -> Parser String
letters str = Parser $ \s ->
if take (length str) s == str
then Just (str,drop (length str) s)
else Nothing
```

EG `parse (letter 'c') "chocolate"`

```
instance Functor Parser where
fmap f p = Parser $ \s ->
case parse p s of
Nothing -> Nothing
Just (v,s') -> Just (f v,s')
```

Notice how, if you squint, you can see that this `Functor`

instance of `Parser`

is a combination of the instances for `Maybe`

and for readers:

```
instance Functor Maybe where
fmap f Nothing = Nothing
fmap f (Just v) = Just (f v)
instance Functor ((->) r) where
-- (a -> b) -> (r -> a) -> (r -> b)
fmap f g x = f (g x)
```

```
instance Applicative Parser where
pure a = Parser $ \s -> Just (a,s)
f <*> a = Parser $ \s -> -- f :: Parser (a -> b), a :: Parser a
case parse f s of
Nothing -> Nothing
Just (g,s') -> parse (fmap g a) s' -- g :: a -> b, fmap g a :: Parser b
```

(Our `Applicative`

instance is *also* a combination of the instances for `Maybe`

and reader. You can take my word for it… or verify for yourself.)

EG `parse ((\x -> [x,x,x]) <$> letter 'c') "chocolate"`

```
letterC = letter 'c'
strCH = (\c h -> [c,h]) <$> letter 'c' <*> letter 'h'
```

EG `parse strCH "chocolate"`

```
string :: String -> Parser String
string [] = pure ""
string (c:s) = (:) <$> (letter c) <*> string s
```

EG `parse (string "choco") "chocolate"`

EG `parse (string "vanilla") "chocolate"`

```
eof :: Parser ()
eof = Parser $ \s -> if null s then Just ((),"") else Nothing
strCH' = (\c h _ -> [c,h]) <$> letter 'c' <*> letter 'h' <*> eof
```

EG ‘parse strCH’ “ch”`is`

Just (“ch”,[])`

EG `parse strCH' "chocolate"`

yields `Nothing`

.

Notice how we ignored a value for `eof`

. We can use `(<*)`

and `(*>)`

to save ourselves some trouble, writing, e.g.,

```
strCH'' = (\c h -> [c,h]) <$> letter 'c' <*> letter 'h' <* eof
ensure :: (a -> Bool) -> Parser a -> Parser a
ensure pred p = Parser $ \s ->
case parse p s of
Nothing -> Nothing
Just (a,s') -> if pred a then Just (a,s') else Nothing
satisfy :: (Char -> Bool) -> Parser Char
satisfy p = Parser f
where f [] = Nothing
f (x:xs) = if p x then Just (x,xs) else Nothing
```

Observe that `letter c`

is equivalent to `satisfy (==c)`

.

```
lookahead :: Parser (Maybe Char)
lookahead = Parser f
where f [] = Just (Nothing,[])
f (c:s) = Just (Just c,c:s)
```

We could manually define integer parsing:

```
integer :: Parser Int
integer = Parser $ \s ->
let (digits,rest) = span isDigit s in
if null digits then Nothing else Just (read digits,rest)
```

But it’s nicer if we define a notion of choice:

```
class Applicative f => Alternative f where
empty :: f a
(<|>) :: f a -> f a -> f a
many, some :: Alternative f => f a -> f [a]
some p = (:) <$> p <*> many p
many p = some p <|> pure []
instance Alternative Maybe where
-- empty :: f a
empty = Nothing
-- (<|>) :: f a -> f a -> f a
Just x <|> _ = Just x
Nothing <|> r = r
-- empty <|> f == f
-- f <|> empty == f
```

```
instance Alternative Parser where
empty = Parser $ \s -> Nothing
p1 <|> p2 = Parser $ \s ->
case parse p1 s of
Just (a,s') -> Just (a,s')
Nothing -> parse p2 s
integer' :: Parser String
integer' = read <$> someDigits
where someDigits = (:) <$> satisfy isDigit <*> moreDigits
moreDigits = someDigits <|> pure []
int :: Parser Int
int = read <$> some (satisfy isDigit)
```

EG `parse int "8675309"`

EG `parse int "5551212zoop"`

EG `parse int "KL51212"`

`threeInts = (\n1 n2 n3 -> [n1,n2,n3]) <$> (int <* char ',') <*> (int <* char ',') <*> (ensure (>0) int)`

EG `parse threeInts "1,2,3"`

EG `parse threeInts "1,2,0"`

Let’s build a parser for arithmetic expressions. We’ll keep it as an invariant that we parse spaces up before each actual phrase, so “2 + 2” and “2+2” and " 2 +2" all yield `Plus (Num 2) (Num 2)`

.

```
spaces :: Parser ()
spaces = many (satisfy isSpace) *> pure ()
char :: Char -> Parser Char
char c = spaces *> satisfy (==c)
plus, times :: Parser Char
plus = char '+'
times = char '*'
num :: Parser Int
num = spaces *> int
data Arith =
Num Int
| Plus Arith Arith
| Times Arith Arith
deriving Show
term, factor, atom :: Parser Arith
term = Plus <$> factor <* plus <*> term
<|> factor
factor = Times <$> atom <* times <*> factor
<|> atom
atom = Num <$> num
<|> (char '(' *> term <* char ')')
```

Compare this with the CFG:

```
Term ::= Factor + Term | Factor
Factor ::= Atom * Factor | Atom
Atom ::= n | ( Term )
```

Note that this parse has a (benign, for now) bug: its arithmetic is *right* associative.