Regular Languages

Regex, pumping lemma, closure properties.

Pumping lemma

Proof technique for non-regularity.

Regular languages — DFA, NFA, regex, pumping lemma
Notes

FORMAL LANGUAGES & AUTOMATA

A language L over alphabet Σ is a set of strings (each a finite sequence of symbols from Σ).

Chomsky hierarchy:

  • Type 0: recursively enumerable (Turing machines).
  • Type 1: context-sensitive (linear bounded automata).
  • Type 2: context-free (pushdown automata).
  • Type 3: regular (finite automata).

DETERMINISTIC FINITE AUTOMATON (DFA)

A 5-tuple (Q, Σ, δ, q₀, F):

  • Q: finite set of states.
  • Σ: input alphabet.
  • δ: Q × Σ → Q (transition function — deterministic).
  • q₀: start state.
  • F ⊆ Q: accept states.

Run on input w = w₁w₂...wₙ: start at q₀; apply δ for each symbol. Accept if end in F.

Language of DFA: set of strings it accepts.


NON-DETERMINISTIC FINITE AUTOMATON (NFA)

Same 5-tuple, but δ: Q × Σ → P(Q) (powerset). May have multiple next-states for same input. Also allows ε-transitions (no input consumed).

NFA accepts w if at least one possible run ends in accept state.


DFA vs NFA

  • Both accept the same class of languages — regular languages.
  • NFA can be smaller (fewer states for same language).
  • Every NFA can be converted to a DFA via subset construction (potentially exponential blow-up: 2^n states).

REGULAR EXPRESSIONS

  • ∅: empty language.
  • ε: empty string.
  • a (a ∈ Σ): symbol.
  • R₁ + R₂ (or R₁ | R₂): union.
  • R₁R₂: concatenation.
  • R*: Kleene star (zero or more).
  • R+: one or more (= RR*).
  • R?: zero or one.

Examples:

  • (a+b)*: any string over {a, b}.
  • ab: zeros or more a's, then zero or more b's.
  • (ab)*: ε, ab, abab, ababab, ...
  • 0(0+1)*0: starts and ends with 0, length ≥ 2.

REGEX ↔ FA EQUIVALENCES

Kleene's theorem: regular languages = languages described by regex = languages accepted by FA.

Conversions:

  • Regex → NFA (Thompson's construction).
  • NFA → DFA (subset construction).
  • DFA → regex (state elimination).

CLOSURE PROPERTIES OF REGULAR LANGUAGES:

Closed under:

  • Union, Intersection, Complement.
  • Concatenation.
  • Kleene star.
  • Reversal.
  • Homomorphism.

So combining regular languages with these operations yields regular.


MINIMIZATION OF DFA

A DFA is minimal if it has fewest states among all equivalent DFAs.

Hopcroft's algorithm or table-filling method.

  • Two states are equivalent if for all input strings, they go to accept/reject the same way.
  • Algorithm: start with partition {F, Q−F}; refine until stable.

PUMPING LEMMA (for proving non-regularity)

If L is regular, ∃ p > 0 such that any string s ∈ L with |s| ≥ p can be written s = xyz with:

  • |y| ≥ 1.
  • |xy| ≤ p.
  • ∀ k ≥ 0, xy^k z ∈ L.

Usage: to prove L is NOT regular, find a string s where pumping always produces a string outside L.

Classic examples of non-regular languages:

  • {aⁿbⁿ : n ≥ 0} (requires counting).
  • {ww : w ∈ Σ*} (requires remembering w).
  • Palindromes (over Σ with ≥ 2 symbols).

EXAMPLES

Q1. DFA accepting strings over {a, b} containing "ab" as substring.

States: q₀ (no a yet or just b), q₁ (seen a, expecting b), q_accept.

  • q₀ --a--> q₁, --b--> q₀.
  • q₁ --b--> q_accept, --a--> q₁.
  • q_accept loops on both.

Q2. Regex for strings over {0,1} with even number of 0s: (1010)1.

Q3. Show L = {0ⁿ1ⁿ : n ≥ 0} is not regular.

  • Suppose it's regular with pumping length p.
  • s = 0^p 1^p. Decomp xyz with |xy|≤p, |y|≥1.
  • y consists only of 0s (since |xy|≤p). Pump y → more 0s than 1s. Not in L. Contradiction.

EXAM HOOKS:

  • DFA and NFA accept same class (regular). But NFA → DFA may blow up exponentially.
  • Regex ⇔ FA (Kleene's theorem).
  • Regular languages closed under union, intersection, complement, concatenation, Kleene star.
  • Pumping lemma is for proving NON-regularity.
  • Cannot count arbitrarily large numbers → 0ⁿ1ⁿ is not regular.
  • Palindromes (binary) are not regular but are context-free.