Asymptotic Analysis
Big-O, Big-Θ, Big-Ω, recurrence relations, master theorem.
Big-O, Θ, Ω notation
Definitions, common pitfalls.
Why asymptotic? Algorithm performance depends on input size. We care about behavior as n → ∞ — not constants, lower-order terms, or small n.
THREE MAIN NOTATIONS:
Big O (Upper bound):
f(n) = O(g(n)) ↔ ∃ c, n₀ > 0 such that f(n) ≤ c·g(n) for all n ≥ n₀.
- Commonly quoted for worst-case running time, though O itself is just an upper bound on growth.
- "f grows AT MOST as fast as g."
Big Omega (Lower bound):
f(n) = Ω(g(n)) ↔ ∃ c, n₀ > 0 such that f(n) ≥ c·g(n) for all n ≥ n₀.
- Commonly paired with best-case claims, though Ω itself is just a lower bound on growth.
- "f grows AT LEAST as fast as g."
Big Theta (Tight bound):
f(n) = Θ(g(n)) ↔ f = O(g) AND f = Ω(g).
- Exact growth rate.
- "f grows AT THE SAME RATE as g."
SMALL O AND SMALL OMEGA:
- Small o: f(n) = o(g(n)) means f grows STRICTLY slower than g (ratio → 0).
- Small omega: f(n) = ω(g(n)) means f grows STRICTLY faster (ratio → ∞).
So O = "at most" (could be tight); o = "strictly less."
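A quick empirical illustration of these relations (a sketch, not a proof; `ratio_trend` is an ad-hoc helper, not a library function): watch the ratio f(n)/g(n) as n grows.

```python
import math

# Numeric sanity check: the trend of f(n)/g(n) as n grows.
#   ratio -> 0        suggests f = o(g)   (strictly slower)
#   ratio -> constant suggests f = Θ(g)   (same rate)
#   ratio -> infinity suggests f = ω(g)   (strictly faster)
def ratio_trend(f, g, label):
    print(label, [round(f(n) / g(n), 4) for n in (10**2, 10**4, 10**6)])

ratio_trend(lambda n: n, lambda n: n**2, "n / n^2        ->")            # -> 0:  n = o(n^2)
ratio_trend(lambda n: 3*n**2 + n, lambda n: n**2, "(3n^2+n)/n^2  ->")    # -> 3:  Θ(n^2)
ratio_trend(lambda n: n * math.log2(n), lambda n: n, "n log n / n   ->") # -> ∞:  ω(n)
```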
COMMON GROWTH RATES (slow → fast):
| Class | Name | Example |
|---|---|---|
| O(1) | constant | Hash lookup |
| O(log log n) | log-log | Interpolation search (avg, uniform keys) |
| O(log n) | logarithmic | Binary search |
| O(√n) | sqrt | Trial-division primality test (divisors up to √n) |
| O(n) | linear | Single pass through array |
| O(n log n) | linearithmic | Merge sort, heap sort |
| O(n²) | quadratic | Bubble, insertion, selection sort |
| O(n³) | cubic | Floyd-Warshall, naive matrix mult |
| O(n^k) | polynomial | for fixed k |
| O(2ⁿ) | exponential | Subset enumeration; Held-Karp TSP DP is O(n²·2ⁿ) |
| O(n!) | factorial | TSP via permutations, brute force |
ALGEBRA OF O-NOTATION:
- Sum rule: O(f) + O(g) = O(max(f, g)). E.g., O(n²) + O(n) = O(n²).
- Product rule: O(f) × O(g) = O(f·g). E.g., O(n) × O(log n) = O(n log n).
- Constants: O(c · f) = O(f) for any constant c > 0.
- Lower order absorbed: O(n² + n + 100) = O(n²).
- Log base irrelevant: O(log₂ n) = O(log₁₀ n) (= O(log n)).
- Exponents differ: O(2ⁿ) ≠ O(3ⁿ).
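The log-base rule is just the change-of-base identity log_b n = ln n / ln b, so any two log bases differ by a constant factor. A one-line check:

```python
import math

# log2(n) / log10(n) equals log(10)/log(2) ≈ 3.3219 for every n > 1,
# a constant factor, hence O(log₂ n) = O(log₁₀ n) = O(log n).
for n in (10, 10**4, 10**8):
    print(n, round(math.log2(n) / math.log10(n), 6))  # always ≈ 3.321928
```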
COMMON RECURRENCES & SOLUTIONS:
| Recurrence | Algorithm | Solution |
|---|---|---|
| T(n) = T(n−1) + 1 | Linear recursion | O(n) |
| T(n) = T(n−1) + n | Insertion sort | O(n²) |
| T(n) = T(n/2) + 1 | Binary search | O(log n) |
| T(n) = T(n/2) + n | Quickselect avg | O(n) |
| T(n) = 2T(n/2) + 1 | Balanced binary tree traversal | O(n) |
| T(n) = 2T(n/2) + n | Merge sort | O(n log n) |
| T(n) = 2T(n/2) + n log n | — | O(n log² n) |
| T(n) = 2T(n/2) + n² | — | O(n²) |
| T(n) = 4T(n/2) + n | — | O(n²) |
| T(n) = 7T(n/2) + n² | Strassen | O(n^(log₂ 7)) ≈ O(n^2.807) |
(Use Master Theorem when applicable.)
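As a concrete instance of the table's centerpiece, here is a minimal merge sort sketch: the two recursive calls give the 2T(n/2) term and the linear merge gives the + n term (and the O(n) auxiliary buffer noted in the next section).

```python
def merge_sort(a):
    # T(n) = 2T(n/2) + n: split, recurse on both halves, merge in linear time.
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])     # T(n/2)
    right = merge_sort(a[mid:])    # T(n/2)
    # Linear-time merge: the "+ n" term, and the O(n) auxiliary space.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```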
TIME vs SPACE COMPLEXITY:
- Time: number of basic operations.
- Space: auxiliary memory used.
- Quick sort: O(n log n) avg time, O(log n) avg space for the recursion stack (O(n) in the worst case).
- Merge sort: O(n log n) time, O(n) space.
WORST / AVERAGE / BEST CASE:
| Algo | Best | Avg | Worst |
|---|---|---|---|
| Linear search | 1 | n/2 | n |
| Binary search | 1 | log n | log n |
| Bubble sort | n | n² | n² |
| Insertion sort | n | n² | n² |
| Quick sort | n log n | n log n | n² |
| Merge sort | n log n | n log n | n log n |
| Hashing (good hash) | 1 | 1 | n |
Average case assumes some distribution of inputs (often uniform).
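Binary search makes the best/worst split concrete: a first-probe hit costs a single comparison, otherwise the range halves until empty, roughly log₂ n probes. A standard sketch on a sorted list:

```python
def binary_search(a, target):
    # Best case: target sits at the first midpoint -> 1 comparison.
    # Worst case: range halves until empty -> O(log n) comparisons.
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == target:
            return mid
        if a[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1  # not found

a = list(range(0, 100, 2))   # sorted even numbers, len 50
print(binary_search(a, 48))  # found at index 24
print(binary_search(a, 49))  # -1, after ~log2(50) ≈ 6 probes
```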
AMORTIZED ANALYSIS:
Some operations are expensive but happen rarely. Average over a sequence of operations.
Example: dynamic array (e.g., Python list). Append is O(1) amortized, even though occasional resize is O(n). The "blame" is spread over many cheap appends.
Methods: aggregate, accounting, potential.
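A sketch of the aggregate method on a doubling array (illustrative growth policy, not CPython's exact one): over n appends the resizes copy 1 + 2 + 4 + … < 2n elements in total, so the average cost per append is O(1).

```python
class DynArray:
    """Doubling dynamic array; counts element copies to show amortized O(1)."""
    def __init__(self):
        self.cap, self.size, self.copies = 1, 0, 0
        self.buf = [None]

    def append(self, x):
        if self.size == self.cap:      # rare expensive resize: O(n) copies
            self.cap *= 2
            new = [None] * self.cap
            for i in range(self.size):
                new[i] = self.buf[i]
                self.copies += 1
            self.buf = new
        self.buf[self.size] = x        # common cheap path: O(1)
        self.size += 1

arr = DynArray()
for i in range(10**5):
    arr.append(i)
# Total copies stay below 2n, so amortized copies per append are constant.
print(arr.copies / arr.size)  # ≈ 1.3, a constant independent of n
```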
COMPARISONS:
For large n, what's faster?
- Constant vs anything > constant → constant wins.
- log n vs n → log n wins.
- n vs n log n → linear wins.
- n^k vs 2ⁿ → polynomial wins (eventually).
- 2ⁿ vs n! → exponential wins (2ⁿ grows more slowly); n! eventually outgrows cⁿ for every constant c.
Real-world impact:
- For n = 10⁶: n² = 10¹² (way too slow); n log₂ n ≈ 2×10⁷ (fast).
- For n = 30: 2ⁿ ≈ 10⁹ (slow but doable); for n = 50: 2ⁿ ≈ 10¹⁵ (too slow).
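These figures are easy to reproduce; against a rule-of-thumb budget of roughly 10⁸–10⁹ simple operations per second, the quick calculation below separates "fast" from "hopeless".

```python
import math

# Rough operation counts; compare against ~1e9 ops/sec as a rule of thumb.
n = 10**6
print(f"n={n}: n^2 = {n**2:.1e},  n*log2(n) = {n * math.log2(n):.1e}")
for n in (30, 50):
    print(f"n={n}: 2^n = {2**n:.1e}")
# n=1000000: n^2 = 1.0e+12, n*log2(n) = 2.0e+07
# n=30: 2^n = 1.1e+09;  n=50: 2^n = 1.1e+15
```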
EXAM HOOKS:
- f(n) = O(g(n)) means f ≤ c·g eventually.
- Constant factors and lower-order terms ignored in asymptotic analysis.
- log n = O(n) but n ≠ O(log n).
- 2ⁿ ≠ O(2^(n/2)): the ratio 2ⁿ / 2^(n/2) = 2^(n/2) → ∞, so the base and exponent both matter.
- n! grows faster than 2ⁿ (by Stirling's approximation, n! ≈ √(2πn)·(n/e)ⁿ).
- "Tight bound" = Θ. "Upper bound" = O.
Master theorem
Three cases for T(n) = aT(n/b) + f(n).
For a recurrence of the form T(n) = a·T(n/b) + f(n), where a ≥ 1 and b > 1 are constants and f(n) is asymptotically positive:
Compare f(n) with n^(log_b a) (the critical polynomial). Three cases:
Case 1. If f(n) = O(n^(log_b a − ε)) for some ε > 0 → T(n) = Θ(n^(log_b a))
The recursion dominates; work is concentrated at leaves.
Case 2. If f(n) = Θ(n^(log_b a) · log^k n), k ≥ 0 → T(n) = Θ(n^(log_b a) · log^(k+1) n)
Balanced; recursion and combine work are comparable.
Case 3. If f(n) = Ω(n^(log_b a + ε)) AND a·f(n/b) ≤ c·f(n) for some c < 1 (regularity) → T(n) = Θ(f(n))
The combine step dominates.
Worked examples:
Merge sort: T(n) = 2T(n/2) + n. a=2, b=2 → n^(log₂ 2) = n. f(n) = n. Case 2 with k=0 → T(n) = Θ(n log n).
Binary search: T(n) = T(n/2) + 1. a=1, b=2 → n^(log₂ 1) = n^0 = 1. f(n) = 1 = Θ(1). Case 2 with k=0 → T(n) = Θ(log n).
Karatsuba: T(n) = 3T(n/2) + n. a=3, b=2 → n^(log₂ 3) ≈ n^1.585. f(n) = n. Case 1 → T(n) = Θ(n^1.585).
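A numeric sanity check of the three worked examples (a sketch; T(1) = 1 and integer halving are simplifying assumptions): the ratio of T(n) to the predicted bound should settle toward a constant.

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n, a, b, f):
    # Evaluate T(n) = a·T(n/b) + f(n) with base case T(1) = 1.
    return 1 if n <= 1 else a * T(n // b, a, b, f) + f(n)

examples = [
    ("merge sort    T / (n log2 n)", 2, 2, lambda n: n, lambda n: n * math.log2(n)),
    ("binary search T / log2 n    ", 1, 2, lambda n: 1, lambda n: math.log2(n)),
    ("karatsuba     T / n^(log2 3)", 3, 2, lambda n: n, lambda n: n ** math.log2(3)),
]
for name, a, b, f, bound in examples:
    # Each ratio settles toward a constant (1, 1, and 3 here), matching Θ(bound).
    print(name, [round(T(n, a, b, f) / bound(n), 3) for n in (2**8, 2**12, 2**16)])
```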
When master theorem doesn't apply:
- a is not a constant (e.g. T(n) = n·T(n/2))
- f(n) is not polynomially comparable to n^(log_b a) (e.g., f(n) = n / log n against a critical polynomial of n)
- Regularity condition fails in Case 3
For these, use substitution, recursion tree, or Akra-Bazzi.
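For instance, the substitution method on the merge sort recurrence: guess T(n) ≤ c·n log₂ n, then verify the guess inductively (a standard sketch).

```latex
% Substitution method: guess T(n) <= c n log2 n, verify by induction.
\begin{aligned}
T(n) &= 2\,T(n/2) + n
      \le 2c\,\frac{n}{2}\log_2\frac{n}{2} + n
      && \text{(inductive hypothesis)}\\
     &= c\,n(\log_2 n - 1) + n
      = c\,n\log_2 n - (c - 1)\,n
      \le c\,n\log_2 n \quad \text{for } c \ge 1.
\end{aligned}
```

Recursion trees and Akra-Bazzi likewise handle uneven splits such as T(n) = T(n/3) + T(2n/3) + n, which the master theorem cannot.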