A Deterministic Finite Automaton (DFA) is a 5-tuple (Q, Σ, δ, q₀, F):

Q: finite set of states
Σ: input alphabet
δ: Q × Σ → Q (transition function — exactly one next state per (state, symbol))
q₀ ∈ Q: start state
F ⊆ Q: accept (final) states

A DFA accepts an input string if, starting from q₀ and applying δ for each symbol, it ends in some state in F.

NFA (Non-deterministic): δ: Q × (Σ ∪ {ε}) → 2^Q. Multiple next states (or none). ε-transitions allowed.

Key theorem: every NFA has an equivalent DFA. The conversion (subset construction) can blow up the state count exponentially (up to 2^|Q|), but the recognized language is the same.

Regular languages. A language is regular iff it's recognized by some DFA (equivalently NFA, ε-NFA, regular expression).

Closure properties — regular languages are closed under:

Union, intersection, complement, difference
Concatenation
Kleene star
Reverse
Homomorphism

Decidable problems for DFA:

Membership (is string w in L?) — O(|w|).
Emptiness (is L = ∅?) — graph reachability.
Equivalence of two DFAs — comparing their minimal forms.
Minimization — Hopcroft's algorithm O(n log n).

Pumping Lemma for regular languages. If L is regular, there exists a pumping length p such that every string s ∈ L with |s| ≥ p can be split as s = xyz, satisfying:

|xy| ≤ p
|y| ≥ 1
xy^i z ∈ L for all i ≥ 0.

Used to prove a language is NOT regular (by contradiction).

Classic non-regular language: L = { aⁿbⁿ | n ≥ 0 }.

Proof: assume regular with pumping length p. Take s = a^p b^p. By condition 1, |xy| ≤ p, so xy consists only of a's. y must be one or more a's. Pump once → xy²z has more a's than b's → not in L. Contradiction. So L is not regular.

Other classics not regular:

{ ww | w ∈ Σ* }
{ a^p | p is prime }
Balanced parentheses (with arbitrary nesting depth) — requires a stack, hence Context-Free, not regular.

Regex → NFA → DFA pipeline:

Regex (POSIX or extended) → Thompson construction → ε-NFA → Subset construction → DFA → Minimization → Optimal DFA.

This is the heart of grep, lex/flex, and most pattern-matching engines (though many use NFAs directly to support backreferences, which take you beyond regular).

GATE-style sample question. How many states in the minimal DFA for L = { strings over {0, 1} divisible by 3 when read as binary }?

Answer: 3 — states represent remainders 0, 1, 2 modulo 3. Start = accept = remainder 0. Transition function: on 0 → 2r mod 3; on 1 → (2r+1) mod 3.

Finite Automata

DFA construction

NFA to DFA conversion