Finite Automata
DFA, NFA, equivalence, minimization.
DFA construction
States, transitions, accept states; example languages.
A Deterministic Finite Automaton (DFA) is a 5-tuple (Q, Σ, δ, q₀, F):
- Q: finite set of states
- Σ: input alphabet
- δ: Q × Σ → Q (transition function — exactly one next state per (state, symbol))
- q₀ ∈ Q: start state
- F ⊆ Q: accept (final) states
A DFA accepts an input string if, starting from q₀ and applying δ for each symbol, it ends in some state in F.
NFA (Non-deterministic): δ: Q × (Σ ∪ {ε}) → 2^Q. Multiple next states (or none). ε-transitions allowed.
Key theorem: every NFA has an equivalent DFA. The conversion (subset construction) can blow up the state count exponentially (up to 2^|Q|), but the recognized language is the same.
Regular languages. A language is regular iff it's recognized by some DFA (equivalently NFA, ε-NFA, regular expression).
Closure properties — regular languages are closed under:
- Union, intersection, complement, difference
- Concatenation
- Kleene star
- Reverse
- Homomorphism
Decidable problems for DFA:
- Membership (is string w in L?) — O(|w|).
- Emptiness (is L = ∅?) — graph reachability.
- Equivalence of two DFAs — comparing their minimal forms.
- Minimization — Hopcroft's algorithm O(n log n).
Pumping Lemma for regular languages. If L is regular, there exists a pumping length p such that every string s ∈ L with |s| ≥ p can be split as s = xyz, satisfying:
- |xy| ≤ p
- |y| ≥ 1
- xy^i z ∈ L for all i ≥ 0.
Used to prove a language is NOT regular (by contradiction).
Classic non-regular language: L = { aⁿbⁿ | n ≥ 0 }.
Proof: assume regular with pumping length p. Take s = a^p b^p. By condition 1, |xy| ≤ p, so xy consists only of a's. y must be one or more a's. Pump once → xy²z has more a's than b's → not in L. Contradiction. So L is not regular.
Other classics not regular:
- { ww | w ∈ Σ* }
- { a^p | p is prime }
- Balanced parentheses (with arbitrary nesting depth) — requires a stack, hence Context-Free, not regular.
Regex → NFA → DFA pipeline:
Regex (POSIX or extended) → Thompson construction → ε-NFA → Subset construction → DFA → Minimization → Optimal DFA.
This is the heart of grep, lex/flex, and most pattern-matching engines (though many use NFAs directly to support backreferences, which take you beyond regular).
GATE-style sample question. How many states in the minimal DFA for L = { strings over {0, 1} divisible by 3 when read as binary }?
Answer: 3 — states represent remainders 0, 1, 2 modulo 3. Start = accept = remainder 0. Transition function: on 0 → 2r mod 3; on 1 → (2r+1) mod 3.
NFA to DFA conversion
Subset construction algorithm.