### CS52 - Spring 2017 - Class 22

#### Lecture notes

- Mentors for next semester (applications due today!)

• deterministic finite automata (DFA) review
- basic idea
- we have a set of states (indicated by circles)
- we have a start state where computation begins (indicated by an arrow)
- we have a collection of final states (indicated by states with an inner circle)
- for each state and each letter in our alphabet, we have a transition to another state

- computing
- we have a string as input on a tape
- we start at the beginning of the string
- repeatedly read the next symbol from the tape and follow the transition for that symbol out of the current state
- when we get to the end of the string/tape:
- if we're in a final state, we accept the string
- otherwise (we're in a non-final state), we reject
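
The computing steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `run_dfa`, the dictionary encoding of transitions, and the `even_a` example machine are assumptions made for this example, not something from the class materials.

```python
def run_dfa(transitions, start, finals, tape):
    """Return True if the DFA accepts the string on the tape."""
    state = start
    for symbol in tape:
        # A DFA has exactly one next state for each (state, symbol) pair.
        state = transitions[(state, symbol)]
    # Accept only if we are in a final state once the input is used up.
    return state in finals


# Hypothetical example DFA over {a, b}: accepts strings with an even number of a's.
even_a = {
    ("even", "a"): "odd",
    ("even", "b"): "even",
    ("odd", "a"): "even",
    ("odd", "b"): "odd",
}

print(run_dfa(even_a, "even", {"even"}, "abab"))  # True
print(run_dfa(even_a, "even", {"even"}, "ab"))    # False
```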

• DFAs over numbers
- we can use any alphabet we want
- if we use 1's and 0's we can interpret them as binary numbers!

- greater_5 (6): determines if the input string, when interpreted as a binary number, is greater than 5

- write a DFA that determines if a number is odd
- look at odd_number
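
One possible odd-number DFA, sketched in Python. It may differ from the odd_number machine in the class examples; the state names and the `is_odd` helper are made up for illustration. It assumes the string is read most significant bit first, so the last symbol read is the low-order bit.

```python
# "even" = the last bit read was 0 (or nothing has been read yet)
# "odd"  = the last bit read was 1
ODD_DFA = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "even",
    ("odd", "1"): "odd",
}


def is_odd(bits):
    """Run the DFA on a string of 0's and 1's; "odd" is the only final state."""
    state = "even"
    for bit in bits:
        state = ODD_DFA[(state, bit)]
    return state == "odd"


# Cross-check the machine against ordinary arithmetic for 0..31.
for n in range(32):
    assert is_odd(format(n, "b")) == (n % 2 == 1)

print(is_odd("101"))  # True  (101 is 5)
print(is_odd("110"))  # False (110 is 6)
```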

• non-deterministic finite automata (NFA)
- almost identical definition to DFA except:
- for a given state and input, can go to zero, one or *more* states (rather than just a single one for DFAs)

- can have epsilon (sometimes called lambda) transitions from one state to another
- doesn't read anything from the input, just transitions

- do not require that there is a transition for every alphabet letter for every state
- if you reach a state that has no transition for the current letter, that path dies, i.e. it does *not* accept

- tend to be a bit easier to create than DFAs

- An NFA accepts if *some* path exists through the NFA based on the input string that ends in a final state

• some NFA examples (found in NFA_examples)
- start_end_a (1): start and end with a
- a(a|b)*a

- 2_or_3_a (2): strings of a's that have lengths divisible by 2 OR 3
- (aa)*|(aaa)*

- end_aa (3): ends in two a's
- (a|b)*aa

- aa_bb (4): has either aa or bb as a substring
- (a|b)*(aa|bb)(a|b)*

- no_bb (5): any string of a's and b's that doesn't have two adjacent b's
- a*|a*b(aa*b)*a*
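
The first four expressions can be tried out directly with Python's `re` module, which uses the same notation for concatenation, `|` and `*`. This is just a quick sanity check, not part of the class examples; the `PATTERNS` dictionary and the `accepts` helper are names invented here.

```python
import re

# Patterns copied from the notes; keys match the NFA example names.
PATTERNS = {
    "start_end_a": r"a(a|b)*a",
    "2_or_3_a": r"(aa)*|(aaa)*",
    "end_aa": r"(a|b)*aa",
    "aa_bb": r"(a|b)*(aa|bb)(a|b)*",
}


def accepts(name, s):
    # fullmatch requires the *whole* string to match, like an automaton
    # that must consume all of its input before accepting.
    return re.fullmatch(PATTERNS[name], s) is not None


print(accepts("start_end_a", "abba"))  # True
print(accepts("2_or_3_a", "aaaaa"))    # False: 5 is not divisible by 2 or 3
print(accepts("end_aa", "babaa"))      # True
print(accepts("aa_bb", "abab"))        # False: no aa or bb substring
```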

• Do NFAs give us more power, i.e. are there some languages that we can recognize with NFAs that we cannot recognize with DFAs?
- how would we show that NFAs are more powerful?
- find a language that can be represented by an NFA, but cannot be represented by a DFA

- how would we show that they're not more powerful?
- if we can show that for any NFA there is an equivalent DFA and vice versa, then we can show that they are equivalent, i.e. have the same representative power

- Given a DFA, how can we create an equivalent NFA?
- Easy... don't do anything!

- Given an NFA, how can we create an equivalent DFA?

- Consider the end_aa NFA (found in NFA_examples)
- Is aaaaa in the language?
- what states could we be in after reading the first a?
- q_0 or q_1
- what states would we be in after reading the second a?
- we could start in either q_0 *or* q_1
- if we were in q_0, we'd end up in q_0 or q_1
- if we were in q_1, we'd end up in q_2
- therefore, after reading two a's we could end up in any of: q_0 or q_1 or q_2
- the third a?
- we could be in q_0 or q_1 or q_2
- if we were in q_0 -> q_0 or q_1
- if we were in q_1 -> q_2
- if we were in q_2 -> reject
- does this matter?
- No. We only need to find *one* path through the state transitions that ends in an accepting state
- the fourth a?
- q_0 or q_1 or q_2
- the fifth a?
- q_0 or q_1 or q_2
- since q_2 is *an* option, there's a set of transitions between the states that gets us to an accepting state

- Is ababaa in the language?
- what states would we be in after reading the first a?
- q_0 or q_1
- second letter, b?
- q_0
- third letter, a?
- q_0 or q_1
- fourth letter, b?
- q_0
- fifth letter, a?
- q_0 or q_1
- sixth letter, a?
- q_0 or q_1 or q_2
- since q_2 is *an* option, then there's a set of transitions between the states that gets us to an accepting state

- is ababa in the language?
- a:
- q_0 or q_1
- b:
- q_0
- a:
- q_0 or q_1
- b:
- q_0
- a:
- q_0 or q_1
- reject: no way to get to an accepting state
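
The hand traces above amount to tracking the *set* of states the NFA could be in after each letter. A minimal Python sketch of that idea follows. The transition table `END_AA` is inferred from the traces above (it is an assumption about what the end_aa machine looks like), the helper name `nfa_trace` is made up, and epsilon transitions are not handled.

```python
# end_aa NFA as suggested by the traces; a missing entry means "no transition".
END_AA = {
    ("q_0", "a"): {"q_0", "q_1"},
    ("q_0", "b"): {"q_0"},
    ("q_1", "a"): {"q_2"},
}


def nfa_trace(delta, start, finals, s):
    """Return (accepted, history) where history lists the possible states after each letter."""
    current = {start}
    history = []
    for letter in s:
        nxt = set()
        for q in current:
            # A path with no transition simply contributes nothing.
            nxt |= delta.get((q, letter), set())
        current = nxt
        history.append(sorted(current))
    # Accept if at least one possible state is an accepting state.
    return bool(current & finals), history


accepted, history = nfa_trace(END_AA, "q_0", {"q_2"}, "ababaa")
print(accepted)     # True
print(history[-1])  # ['q_0', 'q_1', 'q_2']
print(nfa_trace(END_AA, "q_0", {"q_2"}, "ababa")[0])  # False
```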

• Constructing a DFA from an NFA
- the basic idea is that we're going to create DFA states that represent one or more of the NFA states
- DFA state [Q] (where Q is a set of 1 or more NFA states) will transition to DFA state [Q'] on letter l, where Q' is the set of every NFA state q' that has a transition on letter l from *some* q \in Q
- start state is [q_0]
- accepting states are any [Q] where at least one q \in Q is an accepting state in the NFA

- how many DFA states can we have at most?
- 2^k - 1 where k is the number of NFA states
- think of each DFA state like a k bit number
- 1 if it represents the original NFA state
- 0 otherwise
- can't have all zeros, so 2^k - 1
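
The k-bit-number view can be checked with a short Python sketch that enumerates every candidate DFA state for a small NFA. The function name `candidate_dfa_states` is made up for this example.

```python
def candidate_dfa_states(nfa_states):
    """List every non-empty subset of the NFA states, one per k-bit number."""
    k = len(nfa_states)
    result = []
    for bits in range(1, 2 ** k):  # start at 1: all zeros would be the empty set
        # Bit i is 1 exactly when NFA state i is part of this DFA state.
        subset = [q for i, q in enumerate(nfa_states) if (bits >> i) & 1]
        result.append(subset)
    return result


subsets = candidate_dfa_states(["q_0", "q_1", "q_2"])
print(len(subsets))  # 7 == 2**3 - 1
print(subsets[:3])   # [['q_0'], ['q_1'], ['q_0', 'q_1']]
```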

- one algorithm
  - create the start state [q_0]
  - add [q_0] to the process queue
  - as long as the process queue isn't empty:
    - remove state s from the process queue
    - for each letter l in the alphabet:
      - new_s = []
      - for each "old state", q_i, in s that has a transition q_i: l -> q_j
        - add q_j to new_s
      - if new_s isn't empty:
        - if DFA state new_s doesn't exist already
          - create state new_s
          - add new_s to the process queue
        - add the transition s: l -> new_s
  - if any states don't have transitions for all letters in the alphabet, create a "sink" state that transitions to itself on all letters and have states transition to here for any remaining alphabet letters
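
A minimal Python sketch of this queue-based construction, assuming an NFA without epsilon transitions. The names `nfa_to_dfa`, `dfa_accepts` and the `END_AA` table are illustrative assumptions. One deliberate difference from the steps above: the empty set of NFA states is treated as the "sink" state, so no separate clean-up pass is needed.

```python
from collections import deque


def nfa_to_dfa(delta, start, finals, alphabet):
    """Build a DFA whose states are frozensets of NFA states."""
    start_state = frozenset([start])
    dfa_delta = {}
    seen = {start_state}
    queue = deque([start_state])
    while queue:
        s = queue.popleft()
        for letter in alphabet:
            # Everything reachable on this letter from any NFA state in s.
            new_s = frozenset(
                q_j for q_i in s for q_j in delta.get((q_i, letter), ())
            )
            # An empty new_s plays the role of the sink state.
            dfa_delta[(s, letter)] = new_s
            if new_s not in seen:
                seen.add(new_s)
                queue.append(new_s)
    # A DFA state accepts if it contains at least one accepting NFA state.
    dfa_finals = {s for s in seen if s & finals}
    return dfa_delta, start_state, dfa_finals, seen


def dfa_accepts(dfa_delta, start_state, dfa_finals, s):
    state = start_state
    for letter in s:
        state = dfa_delta[(state, letter)]
    return state in dfa_finals


# end_aa NFA as inferred from the traces earlier in these notes (an assumption).
END_AA = {
    ("q_0", "a"): {"q_0", "q_1"},
    ("q_0", "b"): {"q_0"},
    ("q_1", "a"): {"q_2"},
}

dfa_delta, dfa_start, dfa_finals, dfa_states = nfa_to_dfa(END_AA, "q_0", {"q_2"}, "ab")
print(len(dfa_states))                                        # 3
print(dfa_accepts(dfa_delta, dfa_start, dfa_finals, "ababaa"))  # True
print(dfa_accepts(dfa_delta, dfa_start, dfa_finals, "ababa"))   # False
```

For this particular NFA only 3 of the 7 possible DFA states are ever reached, which is why building states on demand from a queue is preferable to enumerating every subset up front.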

• A few examples (in NFA_examples):
- We can construct a DFA from the end_aa NFA: end_aa_DFA

- We can construct a DFA from the start_end_a NFA: start_end_a_DFA

• Does this show that DFAs and NFAs are equivalent?
- Yes, given either one (a DFA or an NFA), we can create, through a deterministic process, a corresponding machine of the other type
- Therefore, they can recognize the same set of languages

• We can handle lambda/epsilon transitions in a similar way
- I'll let you figure out/investigate for those that are curious :)

• regular language
- any language that can be described by a DFA (or an NFA, remember, they're equivalent)
- any language that can be described by a regular expression!
- how would you prove this?

• What languages are *not* regular?
- 0^n 1^n for any n
- i.e. the language of some number of zeros followed by the *same* number of 1s

- why not?
- can you come up with a regular expression for this language?
- seems hard, since there's no tool for us to count
- can you come up with a DFA or NFA for this language?
- would need about 2(n+1) states to handle every count up to some fixed n
- states are the only way we can count
- only problem is that n isn't bounded, but the number of states has to be finite!
- consider any DFA that recognizes strings of 0^n 1^n up to some fixed n
- it won't recognize the string 0^(n+1) 1^(n+1)
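
To make the counting argument concrete, here is a hedged Python sketch of a DFA that uses its states as a counter up to a fixed limit. The construction and the names `bounded_0n1n_dfa` and `accepts` are assumptions made for illustration. It handles every 0^i 1^i with i up to the limit and fails on the very next one.

```python
def bounded_0n1n_dfa(limit):
    """Build a DFA accepting 0^i 1^i only for 0 <= i <= limit."""
    states = [("zeros", i) for i in range(limit + 1)]  # how many 0's read so far
    states += [("ones", j) for j in range(limit)]      # how many 1's still owed
    states.append("sink")
    # Default: every (state, letter) pair goes to the sink.
    delta = {(state, letter): "sink" for state in states for letter in "01"}
    for i in range(limit):
        delta[(("zeros", i), "0")] = ("zeros", i + 1)  # count another 0
    for i in range(1, limit + 1):
        delta[(("zeros", i), "1")] = ("ones", i - 1)   # first 1; i - 1 still owed
    for j in range(1, limit):
        delta[(("ones", j), "1")] = ("ones", j - 1)    # pay off one more 1
    finals = {("zeros", 0), ("ones", 0)}
    return delta, ("zeros", 0), finals


def accepts(dfa, s):
    delta, state, finals = dfa
    for letter in s:
        state = delta[(state, letter)]
    return state in finals


dfa = bounded_0n1n_dfa(3)
print(accepts(dfa, "0011"))      # True
print(accepts(dfa, "000111"))    # True
print(accepts(dfa, "00001111"))  # False: would need to count to 4
```

Raising the limit fixes one string but always leaves a longer one out, which is the informal reason no single finite machine works for every n.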

- This is a bit of a "hand-wavy" proof
- see the pumping lemma (or take CS81) for a more rigorous proof