Regular Languages
Regex, pumping lemma, closure properties.
Pumping lemma
Proof technique for non-regularity.
FORMAL LANGUAGES & AUTOMATA
A language L over alphabet Σ is a set of strings (each a finite sequence of symbols from Σ).
Chomsky hierarchy:
- Type 0: recursively enumerable (Turing machines).
- Type 1: context-sensitive (linear bounded automata).
- Type 2: context-free (pushdown automata).
- Type 3: regular (finite automata).
DETERMINISTIC FINITE AUTOMATON (DFA)
A 5-tuple (Q, Σ, δ, q₀, F):
- Q: finite set of states.
- Σ: input alphabet.
- δ: Q × Σ → Q (transition function — deterministic).
- q₀: start state.
- F ⊆ Q: accept states.
Run on input w = w₁w₂...wₙ: start at q₀; apply δ for each symbol. Accept if end in F.
Language of DFA: set of strings it accepts.
NON-DETERMINISTIC FINITE AUTOMATON (NFA)
Same 5-tuple, but δ: Q × Σ → P(Q) (powerset). May have multiple next-states for same input. Also allows ε-transitions (no input consumed).
NFA accepts w if at least one possible run ends in accept state.
DFA vs NFA
- Both accept the same class of languages — regular languages.
- NFA can be smaller (fewer states for same language).
- Every NFA can be converted to a DFA via subset construction (potentially exponential blow-up: 2^n states).
REGULAR EXPRESSIONS
- ∅: empty language.
- ε: empty string.
- a (a ∈ Σ): symbol.
- R₁ + R₂ (or R₁ | R₂): union.
- R₁R₂: concatenation.
- R*: Kleene star (zero or more).
- R+: one or more (= RR*).
- R?: zero or one.
Examples:
- (a+b)*: any string over {a, b}.
- ab: zeros or more a's, then zero or more b's.
- (ab)*: ε, ab, abab, ababab, ...
- 0(0+1)*0: starts and ends with 0, length ≥ 2.
REGEX ↔ FA EQUIVALENCES
Kleene's theorem: regular languages = languages described by regex = languages accepted by FA.
Conversions:
- Regex → NFA (Thompson's construction).
- NFA → DFA (subset construction).
- DFA → regex (state elimination).
CLOSURE PROPERTIES OF REGULAR LANGUAGES:
Closed under:
- Union, Intersection, Complement.
- Concatenation.
- Kleene star.
- Reversal.
- Homomorphism.
So combining regular languages with these operations yields regular.
MINIMIZATION OF DFA
A DFA is minimal if it has fewest states among all equivalent DFAs.
Hopcroft's algorithm or table-filling method.
- Two states are equivalent if for all input strings, they go to accept/reject the same way.
- Algorithm: start with partition {F, Q−F}; refine until stable.
PUMPING LEMMA (for proving non-regularity)
If L is regular, ∃ p > 0 such that any string s ∈ L with |s| ≥ p can be written s = xyz with:
- |y| ≥ 1.
- |xy| ≤ p.
- ∀ k ≥ 0, xy^k z ∈ L.
Usage: to prove L is NOT regular, find a string s where pumping always produces a string outside L.
Classic examples of non-regular languages:
- {aⁿbⁿ : n ≥ 0} (requires counting).
- {ww : w ∈ Σ*} (requires remembering w).
- Palindromes (over Σ with ≥ 2 symbols).
EXAMPLES
Q1. DFA accepting strings over {a, b} containing "ab" as substring.
States: q₀ (no a yet or just b), q₁ (seen a, expecting b), q_accept.
- q₀ --a--> q₁, --b--> q₀.
- q₁ --b--> q_accept, --a--> q₁.
- q_accept loops on both.
Q2. Regex for strings over {0,1} with even number of 0s: (1010)1.
Q3. Show L = {0ⁿ1ⁿ : n ≥ 0} is not regular.
- Suppose it's regular with pumping length p.
- s = 0^p 1^p. Decomp xyz with |xy|≤p, |y|≥1.
- y consists only of 0s (since |xy|≤p). Pump y → more 0s than 1s. Not in L. Contradiction.
EXAM HOOKS:
- DFA and NFA accept same class (regular). But NFA → DFA may blow up exponentially.
- Regex ⇔ FA (Kleene's theorem).
- Regular languages closed under union, intersection, complement, concatenation, Kleene star.
- Pumping lemma is for proving NON-regularity.
- Cannot count arbitrarily large numbers → 0ⁿ1ⁿ is not regular.
- Palindromes (binary) are not regular but are context-free.