Hard Lesson 03: Algorithms and Complexity Analysis#

Master algorithm design, complexity analysis, and efficient problem-solving techniques.

Learning Objectives#

By the end of this lesson, you will be able to:

  • ✅ Analyze time and space complexity using Big O notation

  • ✅ Understand and apply all major complexity classes (O(1), O(log n), O(n), O(n log n), O(n²), O(2ⁿ))

  • ✅ Implement and compare sorting algorithms (bubble, selection, insertion, merge, quick, heap)

  • ✅ Apply searching algorithms (linear, binary, interpolation)

  • ✅ Design recursive solutions with proper base cases

  • ✅ Solve problems using dynamic programming (memoization and tabulation)

  • ✅ Apply greedy algorithms for optimization problems

  • ✅ Implement graph algorithms (DFS, BFS, Dijkstra’s, topological sort)

  • ✅ Choose the right algorithm for specific problems

Prerequisites#

  • Strong understanding of Python data structures (lists, dicts, sets)

  • Familiarity with recursion

  • Basic understanding of graphs and trees

  • Knowledge of object-oriented programming

Why Algorithms and Complexity Matter#

Real-World Impact:

  • Google Search: Returns results in milliseconds from trillions of pages (efficient algorithms)

  • Netflix Recommendations: Processes billions of data points (optimized complexity)

  • GPS Navigation: Finds shortest path in real-time (Dijkstra’s algorithm)

  • Database Indexing: Fast queries on massive datasets (B-trees, O(log n) search)

  • Compiler Optimization: Efficient code generation (graph algorithms)

  • Machine Learning: Training on huge datasets (efficient matrix operations)

Cost Implications:

  • O(n²) vs O(n log n): At 1 million items, the difference is minutes to hours versus a fraction of a second (see the estimate below)

  • Poor algorithm choice can cost millions in cloud computing

  • Efficient algorithms enable real-time systems
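
A back-of-envelope estimate makes the gap concrete. Assuming, very roughly, 10⁹ simple operations per second (real constants vary widely by hardware and workload):

import math

n = 1_000_000
ops_per_sec = 1e9  # rough assumption: ~10⁹ simple ops/second

print(f"O(n log n): ~{n * math.log2(n) / ops_per_sec:.2f} s")   # ~0.02 s
print(f"O(n²):      ~{n**2 / ops_per_sec / 60:.0f} min")        # ~17 min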


Part 1: Big O Notation - The Language of Complexity#

What is Big O?#

Big O notation describes how algorithm performance scales with input size.

Key Principles:

  • Drop constants: O(2n) → O(n), O(100) → O(1)

  • Drop lower-order terms: O(n² + n) → O(n²)

  • Focus on worst case: What happens when input is largest/worst?

  • Ignore coefficients: O(3n log n) → O(n log n)
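
To see why dropping constants and lower-order terms is safe, compare a hypothetical algorithm costing f(n) = 3n² + 5n + 100 operations against plain n². The ratio settles toward the constant 3, so only the n² term governs growth:

def op_count(n):
    """Operation count for a hypothetical 3n² + 5n + 100 algorithm."""
    return 3 * n**2 + 5 * n + 100

for n in [10, 100, 1000, 10000]:
    # Ratio approaches 3: the lower-order terms stop mattering as n grows
    print(f"n={n:>6}: f(n)={op_count(n):>13,}  f(n)/n² = {op_count(n) / n**2:.3f}")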

Common Complexity Classes (Best to Worst)#

| Complexity | Name | Example | 1K items | 1M items |
|------------|------|---------|----------|----------|
| O(1) | Constant | Array access | 1 op | 1 op |
| O(log n) | Logarithmic | Binary search | ~10 ops | ~20 ops |
| O(n) | Linear | Linear search | 1K ops | 1M ops |
| O(n log n) | Linearithmic | Merge sort | ~10K ops | ~20M ops |
| O(n²) | Quadratic | Bubble sort | 1M ops | 1T ops |
| O(2ⁿ) | Exponential | Recursive Fibonacci | ~10³⁰¹ ops (intractable) | Intractable |
| O(n!) | Factorial | Traveling salesman | Intractable | Longer than the heat death of the universe |

import time

def measure_time(func, *args):
    """Utility to measure execution time."""
    start = time.perf_counter()  # monotonic, high-resolution timer (better than time.time for benchmarks)
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed

# O(1) - Constant Time
# Operations that take same time regardless of input size
def get_first_element(arr):
    """O(1) - Array access is constant time."""
    return arr[0] if arr else None

def hash_lookup(dictionary, key):
    """O(1) average case - Hash table lookup."""
    return dictionary.get(key)

# O(log n) - Logarithmic Time
# Halves problem size each iteration
def binary_search(arr, target):
    """O(log n) - Binary search on sorted array."""
    left, right = 0, len(arr) - 1
    
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1

# O(n) - Linear Time
# Processes each element once
def linear_search(arr, target):
    """O(n) - Must check each element."""
    for i, val in enumerate(arr):
        if val == target:
            return i
    return -1

def find_max(arr):
    """O(n) - Must examine all elements."""
    if not arr:
        return None
    max_val = arr[0]
    for num in arr[1:]:
        if num > max_val:
            max_val = num
    return max_val

# O(n²) - Quadratic Time
# Nested loops over same data
def find_duplicates(arr):
    """O(n²) - Nested loop comparing all pairs."""
    duplicates = []
    for i in range(len(arr)):
        for j in range(i + 1, len(arr)):
            if arr[i] == arr[j] and arr[i] not in duplicates:
                duplicates.append(arr[i])
    return duplicates

# Test complexity differences
test_arr = list(range(10000))  # Sorted array

_, t_const = measure_time(get_first_element, test_arr)
_, t_log = measure_time(binary_search, test_arr, 9999)
_, t_linear = measure_time(linear_search, test_arr, 9999)

print("Complexity Comparison (10,000 elements):")
print(f"  O(1)     - Constant:     {t_const*1000:.6f} ms")
print(f"  O(log n) - Logarithmic:  {t_log*1000:.6f} ms")
print(f"  O(n)     - Linear:       {t_linear*1000:.6f} ms")
print(f"\nO(log n) is {t_linear/t_log:.0f}x faster than O(n) for this size!")

Analyzing Code for Complexity#

Rules of thumb:

  1. Simple statements: O(1)

  2. Single loop: O(n) where n is loop iterations

  3. Nested loops: Multiply complexities (usually O(n²))

  4. Divide and conquer: Often O(log n) or O(n log n)

  5. Sequential statements: Add complexities, keep dominant term

def analyze_complexity_examples(arr):
    """
    Demonstrate complexity analysis.
    """
    n = len(arr)
    
    # Example 1: O(1) + O(n) = O(n)
    first = arr[0]  # O(1)
    for x in arr:   # O(n)
        print(x)
    # Overall: O(n) - keep dominant term
    
    # Example 2: O(n) + O(n) = O(n)
    for x in arr:   # O(n)
        print(x)
    for x in arr:   # O(n)
        print(x)
    # Overall: O(2n) = O(n) - drop constants
    
    # Example 3: O(n) * O(n) = O(n²)
    for i in arr:   # O(n)
        for j in arr:  # O(n) for each i
            print(i, j)
    # Overall: O(n²) - nested loops multiply
    
    # Example 4: O(n²) + O(n) = O(n²)
    for i in arr:      # O(n²)
        for j in arr:
            print(i, j)
    for x in arr:      # O(n)
        print(x)
    # Overall: O(n²) - keep dominant term

print("\nComplexity Analysis Rules:")
print("  1. Drop constants: O(2n) → O(n)")
print("  2. Drop lower terms: O(n² + n) → O(n²)")
print("  3. Sequential: Add, keep dominant")
print("  4. Nested loops: Multiply")
print("  5. Divide input: Often O(log n)")

Part 2: Sorting Algorithms - Comparison and Implementation#

Elementary Sorting Algorithms (O(n²))#

Good for small datasets or nearly sorted data.

def bubble_sort(arr):
    """
    Bubble Sort - O(n²) time, O(1) space
    
    Repeatedly swaps adjacent elements if they're in wrong order.
    Largest element "bubbles" to end each pass.
    
    Pros: Simple, stable, in-place
    Cons: Slow for large data
    """
    arr = arr.copy()
    n = len(arr)
    
    for i in range(n):
        swapped = False
        # Last i elements are already sorted
        for j in range(0, n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        
        # Optimization: stop if no swaps (already sorted)
        if not swapped:
            break
    
    return arr

def selection_sort(arr):
    """
    Selection Sort - O(n²) time, O(1) space
    
    Finds minimum element and swaps to beginning.
    
    Pros: Simple, fewer swaps than bubble
    Cons: Not stable, always O(n²)
    """
    arr = arr.copy()
    n = len(arr)
    
    for i in range(n):
        # Find minimum in remaining unsorted portion
        min_idx = i
        for j in range(i + 1, n):
            if arr[j] < arr[min_idx]:
                min_idx = j
        
        # Swap minimum to current position
        arr[i], arr[min_idx] = arr[min_idx], arr[i]
    
    return arr

def insertion_sort(arr):
    """
    Insertion Sort - O(n²) worst case, O(n) best case, O(1) space
    
    Builds sorted array one element at a time.
    Like sorting playing cards in your hand.
    
    Pros: Stable, efficient for small/nearly sorted data, online algorithm
    Cons: O(n²) for large random data
    """
    arr = arr.copy()
    
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        
        # Shift elements greater than key to the right
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        
        # Insert key at correct position
        arr[j + 1] = key
    
    return arr

# Test elementary sorting
import random
test_data = [64, 34, 25, 12, 22, 11, 90]

print("Elementary Sorting Algorithms:")
print(f"  Original:       {test_data}")
print(f"  Bubble Sort:    {bubble_sort(test_data)}")
print(f"  Selection Sort: {selection_sort(test_data)}")
print(f"  Insertion Sort: {insertion_sort(test_data)}")

Advanced Sorting Algorithms (O(n log n))#

Efficient for large datasets.

def merge_sort(arr):
    """
    Merge Sort - O(n log n) time, O(n) space
    
    Divide and conquer:
    1. Divide array into halves
    2. Recursively sort each half
    3. Merge sorted halves
    
    Pros: Stable, guaranteed O(n log n), good for linked lists
    Cons: O(n) extra space, not in-place
    """
    if len(arr) <= 1:
        return arr
    
    # Divide
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    
    # Conquer (merge)
    return merge(left, right)

def merge(left, right):
    """Merge two sorted arrays."""
    result = []
    i = j = 0
    
    # Compare and merge
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    
    # Append remaining elements
    result.extend(left[i:])
    result.extend(right[j:])
    return result

def quicksort(arr):
    """
    Quick Sort - O(n log n) average, O(n²) worst case
    
    Divide and conquer:
    1. Choose pivot
    2. Partition: elements < pivot left, > pivot right
    3. Recursively sort partitions
    
    Pros: Fast in practice, cache-friendly; in-place partitioning
          (Lomuto/Hoare) needs only O(log n) stack space
    Cons: Not stable, worst case O(n²) with bad pivots
    
    Note: this list-comprehension version favors clarity and uses
    O(n) extra space; it is not the in-place variant.
    """
    if len(arr) <= 1:
        return arr
    
    # Choose pivot (middle element)
    pivot = arr[len(arr) // 2]
    
    # Partition
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    
    # Recursively sort and combine
    return quicksort(left) + middle + quicksort(right)

def heapsort(arr):
    """
    Heap Sort - O(n log n) time, O(1) space
    
    1. Build max heap from array
    2. Repeatedly extract maximum
    
    Pros: In-place, guaranteed O(n log n)
    Cons: Not stable, slower than quicksort in practice
    """
    def heapify(arr, n, i):
        """Maintain max heap property."""
        largest = i
        left = 2 * i + 1
        right = 2 * i + 2
        
        if left < n and arr[left] > arr[largest]:
            largest = left
        if right < n and arr[right] > arr[largest]:
            largest = right
        
        if largest != i:
            arr[i], arr[largest] = arr[largest], arr[i]
            heapify(arr, n, largest)
    
    arr = arr.copy()
    n = len(arr)
    
    # Build max heap
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)
    
    # Extract elements from heap
    for i in range(n - 1, 0, -1):
        arr[0], arr[i] = arr[i], arr[0]
        heapify(arr, i, 0)
    
    return arr

# Benchmark sorting algorithms
test_sizes = [100, 1000, 5000]

print("\nSorting Algorithm Performance (random data):\n")
print(f"{'Algorithm':<15} {'n=100':<12} {'n=1000':<12} {'n=5000':<12}")
print("-" * 55)

# O(n²) bubble sort: only run on the smaller sizes (too slow beyond ~1000)
print(f"{'Bubble Sort':<15}", end="")
for size in test_sizes:
    data = [random.randint(1, 1000) for _ in range(size)]
    if size <= 1000:
        _, t = measure_time(bubble_sort, data)
        print(f" {t*1000:>10.4f}ms", end="")
    else:
        print(f" {'skipped':>10}  ", end="")
print()

for algo_name, algo_func in [("Merge Sort", merge_sort), 
                              ("Quick Sort", quicksort),
                              ("Heap Sort", heapsort)]:
    print(f"{algo_name:<15}", end="")
    for size in test_sizes:
        data = [random.randint(1, 1000) for _ in range(size)]
        _, t = measure_time(algo_func, data)
        print(f" {t*1000:>10.4f}ms", end="")
    print()

print("\n✅ O(n log n) algorithms scale much better!")

Sorting Algorithm Comparison#

| Algorithm | Best | Average | Worst | Space | Stable | Notes |
|-----------|------|---------|-------|-------|--------|-------|
| Bubble | O(n) | O(n²) | O(n²) | O(1) | Yes | Simple, slow |
| Selection | O(n²) | O(n²) | O(n²) | O(1) | No | Fewer swaps |
| Insertion | O(n) | O(n²) | O(n²) | O(1) | Yes | Good for small/sorted |
| Merge | O(n log n) | O(n log n) | O(n log n) | O(n) | Yes | Guaranteed performance |
| Quick | O(n log n) | O(n log n) | O(n²) | O(log n) | No | Fast in practice |
| Heap | O(n log n) | O(n log n) | O(n log n) | O(1) | No | In-place, guaranteed |
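
Worth knowing before hand-rolling any of these: Python's built-in sorted() and list.sort() use Timsort, a stable hybrid of merge and insertion sort with O(n log n) worst case and O(n) on already-sorted runs. In production code, it should be your default:

data = [64, 34, 25, 12, 22, 11, 90]
print(sorted(data))                      # stable, O(n log n) worst case
print(sorted(data, key=lambda x: -x))    # descending via a key function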


Part 3: Searching Algorithms#

Binary Search - The O(log n) Workhorse#

def binary_search_iterative(arr, target):
    """
    Binary Search (Iterative) - O(log n) time, O(1) space
    
    Requires: Sorted array
    Strategy: Repeatedly halve search space
    """
    left, right = 0, len(arr) - 1
    
    while left <= right:
        mid = (left + right) // 2
        
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1  # Search right half
        else:
            right = mid - 1  # Search left half
    
    return -1  # Not found

def binary_search_recursive(arr, target, left=0, right=None):
    """
    Binary Search (Recursive) - O(log n) time, O(log n) space
    """
    if right is None:
        right = len(arr) - 1
    
    if left > right:
        return -1
    
    mid = (left + right) // 2
    
    if arr[mid] == target:
        return mid
    elif arr[mid] < target:
        return binary_search_recursive(arr, target, mid + 1, right)
    else:
        return binary_search_recursive(arr, target, left, mid - 1)

def binary_search_leftmost(arr, target):
    """
    Find leftmost occurrence of target.
    Useful for finding insertion point or first occurrence.
    """
    left, right = 0, len(arr)
    
    while left < right:
        mid = (left + right) // 2
        if arr[mid] < target:
            left = mid + 1
        else:
            right = mid
    
    return left

# Test binary search variants
sorted_arr = [1, 2, 3, 3, 3, 4, 5, 6, 7, 8, 9]

print("Binary Search Variants:")
print(f"  Array: {sorted_arr}")
print(f"  Find 3 (iterative): index {binary_search_iterative(sorted_arr, 3)}")
print(f"  Find 3 (recursive): index {binary_search_recursive(sorted_arr, 3)}")
print(f"  Find leftmost 3:    index {binary_search_leftmost(sorted_arr, 3)}")
print(f"  Find 10 (not found): {binary_search_iterative(sorted_arr, 10)}")

Part 4: Recursion - Divide and Conquer#

Understanding Recursion#

def factorial(n):
    """
    Classic recursion example.
    
    Base case: n <= 1
    Recursive case: n * factorial(n-1)
    """
    if n <= 1:
        return 1
    return n * factorial(n - 1)

def fibonacci_recursive(n):
    """
    Naive Fibonacci - O(2ⁿ) - VERY SLOW!
    
    Demonstrates exponential time complexity.
    """
    if n <= 1:
        return n
    return fibonacci_recursive(n - 1) + fibonacci_recursive(n - 2)

def power(base, exp):
    """
    Fast exponentiation - O(log n)
    
    Uses divide and conquer:
    - If exp is even: base^exp = (base²)^(exp/2)
    - If exp is odd: base^exp = base * base^(exp-1)
    """
    if exp == 0:
        return 1
    if exp == 1:
        return base
    
    if exp % 2 == 0:
        half = power(base, exp // 2)
        return half * half
    else:
        return base * power(base, exp - 1)

# Tower of Hanoi - Classic recursion problem
def tower_of_hanoi(n, source, target, auxiliary):
    """
    Tower of Hanoi - O(2ⁿ) moves
    
    Move n disks from source to target using auxiliary.
    """
    if n == 1:
        print(f"  Move disk 1 from {source} to {target}")
        return
    
    # Move n-1 disks to auxiliary
    tower_of_hanoi(n - 1, source, auxiliary, target)
    
    # Move largest disk to target
    print(f"  Move disk {n} from {source} to {target}")
    
    # Move n-1 disks from auxiliary to target
    tower_of_hanoi(n - 1, auxiliary, target, source)

# Test recursion
print("Recursion Examples:")
print(f"  5! = {factorial(5)}")
print(f"  2^10 = {power(2, 10)}")
print(f"\nTower of Hanoi (3 disks):")
tower_of_hanoi(3, 'A', 'C', 'B')

# Demonstrate exponential growth
print("\n⚠️  Fibonacci recursion is SLOW:")
for n in [10, 20, 30]:
    _, t = measure_time(fibonacci_recursive, n)
    print(f"  fib({n}): {t*1000:.2f} ms")
print("  (Notice exponential growth!)")

Part 5: Dynamic Programming - Optimization Through Memoization#

Memoization (Top-Down)#

def fibonacci_memoized(n, memo=None):
    """
    Fibonacci with Memoization - O(n) time, O(n) space
    
    Cache results to avoid redundant calculations.
    Transforms O(2ⁿ) to O(n)!
    """
    if memo is None:
        memo = {}
    
    if n in memo:
        return memo[n]
    
    if n <= 1:
        return n
    
    memo[n] = fibonacci_memoized(n - 1, memo) + fibonacci_memoized(n - 2, memo)
    return memo[n]

# Compare memoized vs naive
print("Memoization Performance:")
for n in [10, 20, 30, 100]:
    _, t = measure_time(fibonacci_memoized, n)
    result = fibonacci_memoized(n)
    print(f"  fib({n}): {result:,} - {t*1000:.4f} ms")
print("\n✅ Memoization makes it FAST!")

Tabulation (Bottom-Up)#

def fibonacci_tabulation(n):
    """
    Fibonacci with Tabulation - O(n) time, O(n) space
    
    Build table bottom-up, no recursion.
    """
    if n <= 1:
        return n
    
    dp = [0] * (n + 1)
    dp[1] = 1
    
    for i in range(2, n + 1):
        dp[i] = dp[i - 1] + dp[i - 2]
    
    return dp[n]

def fibonacci_optimized(n):
    """
    Space-optimized Fibonacci - O(n) time, O(1) space
    
    Only keep last two values.
    """
    if n <= 1:
        return n
    
    prev, curr = 0, 1
    for _ in range(2, n + 1):
        prev, curr = curr, prev + curr
    
    return curr

print("Tabulation vs Space-Optimized:")
n = 100
print(f"  Tabulation:      fib({n}) = {fibonacci_tabulation(n):,}")
print(f"  Space-optimized: fib({n}) = {fibonacci_optimized(n):,}")

Classic DP Problems#

def longest_common_subsequence(s1, s2):
    """
    LCS - O(mn) time and space
    
    Find length of longest subsequence common to both strings.
    
    DP Formula:
    - If chars match: dp[i][j] = dp[i-1][j-1] + 1
    - Else: dp[i][j] = max(dp[i-1][j], dp[i][j-1])
    """
    m, n = len(s1), len(s2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    
    return dp[m][n]

def knapsack_01(weights, values, capacity):
    """
    0/1 Knapsack - O(nW) time and space
    
    Maximize value within weight capacity.
    Each item can be taken (1) or not (0).
    
    DP Formula:
    - If weight fits: dp[i][w] = max(value + dp[i-1][w-weight], dp[i-1][w])
    - Else: dp[i][w] = dp[i-1][w]
    """
    n = len(weights)
    dp = [[0] * (capacity + 1) for _ in range(n + 1)]
    
    for i in range(1, n + 1):
        for w in range(1, capacity + 1):
            if weights[i - 1] <= w:
                # Take item or don't take it
                dp[i][w] = max(
                    values[i - 1] + dp[i - 1][w - weights[i - 1]],
                    dp[i - 1][w]
                )
            else:
                # Can't take (too heavy)
                dp[i][w] = dp[i - 1][w]
    
    return dp[n][capacity]

def coin_change(coins, amount):
    """
    Coin Change - O(amount * len(coins))
    
    Find minimum coins to make amount.
    
    DP Formula:
    dp[i] = min(dp[i], dp[i - coin] + 1) for each coin
    """
    dp = [float('inf')] * (amount + 1)
    dp[0] = 0
    
    for i in range(1, amount + 1):
        for coin in coins:
            if coin <= i:
                dp[i] = min(dp[i], dp[i - coin] + 1)
    
    return dp[amount] if dp[amount] != float('inf') else -1

# Test DP problems
print("Dynamic Programming Problems:\n")

s1, s2 = "ABCDGH", "AEDFHR"
print(f"LCS of '{s1}' and '{s2}': {longest_common_subsequence(s1, s2)}")

weights = [2, 3, 4, 5]
values = [3, 4, 5, 6]
capacity = 8
print(f"\nKnapsack (weights={weights}, values={values}, capacity={capacity}):")
print(f"  Max value: {knapsack_01(weights, values, capacity)}")

coins = [1, 5, 10, 25]
amount = 63
print(f"\nCoin Change (coins={coins}, amount={amount}):")
print(f"  Min coins: {coin_change(coins, amount)}")

Part 6: Graph Algorithms#

Graph Representation and Traversal#

from collections import deque, defaultdict
import heapq

class Graph:
    """
    Graph implementation using adjacency list.
    
    Space: O(V + E) where V = vertices, E = edges
    """
    def __init__(self, directed=False):
        self.graph = defaultdict(list)
        self.directed = directed
    
    def add_edge(self, u, v, weight=1):
        """Add edge from u to v."""
        self.graph[u].append((v, weight))
        if not self.directed:
            self.graph[v].append((u, weight))
    
    def dfs(self, start, visited=None):
        """
        Depth-First Search - O(V + E) time, O(V) space
        
        Explores as far as possible before backtracking.
        Uses stack (recursion call stack).
        """
        if visited is None:
            visited = set()
        
        visited.add(start)
        print(start, end=' ')
        
        for neighbor, _ in self.graph[start]:
            if neighbor not in visited:
                self.dfs(neighbor, visited)
        
        return visited
    
    def bfs(self, start):
        """
        Breadth-First Search - O(V + E) time, O(V) space
        
        Explores all neighbors before going deeper.
        Uses queue.
        """
        visited = set([start])
        queue = deque([start])
        
        while queue:
            vertex = queue.popleft()
            print(vertex, end=' ')
            
            for neighbor, _ in self.graph[vertex]:
                if neighbor not in visited:
                    visited.add(neighbor)
                    queue.append(neighbor)
        
        return visited
    
    def dijkstra(self, start):
        """
        Dijkstra's Shortest Path - O((V + E) log V) with heap
        
        Finds shortest path from start to all vertices.
        Works only with non-negative weights.
        
        Uses greedy approach + priority queue.
        """
        # Include vertices that only appear as edge destinations
        # (otherwise distances[neighbor] raises KeyError for pure sinks)
        vertices = set(self.graph)
        for u in list(self.graph):
            for v, _ in self.graph[u]:
                vertices.add(v)
        
        distances = {vertex: float('inf') for vertex in vertices}
        distances[start] = 0
        
        # Priority queue: (distance, vertex)
        pq = [(0, start)]
        visited = set()
        
        while pq:
            current_dist, current = heapq.heappop(pq)
            
            if current in visited:
                continue
            
            visited.add(current)
            
            for neighbor, weight in self.graph[current]:
                distance = current_dist + weight
                
                if distance < distances[neighbor]:
                    distances[neighbor] = distance
                    heapq.heappush(pq, (distance, neighbor))
        
        return distances
    
    def topological_sort(self):
        """
        Topological Sort (Kahn's Algorithm) - O(V + E)
        
        Linear ordering of vertices such that for every edge u→v,
        u comes before v.
        
        Only works on Directed Acyclic Graphs (DAGs).
        """
        # Collect every vertex (including pure sinks) and compute in-degrees
        in_degree = defaultdict(int)
        vertices = set(self.graph)
        for vertex in list(self.graph):
            for neighbor, _ in self.graph[vertex]:
                in_degree[neighbor] += 1
                vertices.add(neighbor)
        
        # Start with vertices having 0 in-degree
        queue = deque([v for v in vertices if in_degree[v] == 0])
        result = []
        
        while queue:
            vertex = queue.popleft()
            result.append(vertex)
            
            for neighbor, _ in self.graph[vertex]:
                in_degree[neighbor] -= 1
                if in_degree[neighbor] == 0:
                    queue.append(neighbor)
        
        # Check for cycle: a cycle leaves some vertices unprocessed
        if len(result) != len(vertices):
            return None  # Graph has cycle
        
        return result

# Test graph algorithms
g = Graph(directed=False)
g.add_edge(0, 1)
g.add_edge(0, 2)
g.add_edge(1, 3)
g.add_edge(2, 4)
g.add_edge(3, 4)

print("Graph Traversal:\n")
print("DFS from 0: ", end="")
g.dfs(0)
print("\nBFS from 0: ", end="")
g.bfs(0)
print("\n")

# Weighted graph for Dijkstra's
wg = Graph(directed=True)
wg.add_edge('A', 'B', 4)
wg.add_edge('A', 'C', 2)
wg.add_edge('B', 'C', 1)
wg.add_edge('B', 'D', 5)
wg.add_edge('C', 'D', 8)
wg.add_edge('C', 'E', 10)
wg.add_edge('D', 'E', 2)

print("Dijkstra's Shortest Path from 'A':")
distances = wg.dijkstra('A')
for vertex, dist in sorted(distances.items()):
    print(f"  {vertex}: {dist}")

# DAG for topological sort
dag = Graph(directed=True)
dag.add_edge(5, 2)
dag.add_edge(5, 0)
dag.add_edge(4, 0)
dag.add_edge(4, 1)
dag.add_edge(2, 3)
dag.add_edge(3, 1)

print("\nTopological Sort:")
print(f"  {dag.topological_sort()}")

Part 7: Space Complexity#

Time isn’t everything - memory matters too!

import sys

def sum_iterative(n):
    """
    Sum 1 to n iteratively.
    Time: O(n), Space: O(1)
    """
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def sum_recursive(n):
    """
    Sum 1 to n recursively.
    Time: O(n), Space: O(n) due to call stack
    """
    if n <= 0:
        return 0
    return n + sum_recursive(n - 1)

def sum_formula(n):
    """
    Sum 1 to n using formula.
    Time: O(1), Space: O(1)
    """
    return n * (n + 1) // 2

print("Space Complexity Comparison:\n")
n = 1000
sys.setrecursionlimit(n + 100)  # default limit (~1000) would overflow sum_recursive
print(f"Sum 1 to {n}:")
print(f"  Iterative (O(1) space): {sum_iterative(n)}")
print(f"  Recursive (O(n) space): {sum_recursive(n)}")
print(f"  Formula   (O(1) space): {sum_formula(n)}")

print("\n📊 Space complexity matters for:")
print("  - Embedded systems (limited memory)")
print("  - Recursion depth limits (stack overflow)")
print("  - Large-scale data processing")
print("  - Cloud costs (memory pricing)")

Exercises#

Exercise 1: Implement and Analyze Quick Select#

Implement the QuickSelect algorithm to find the kth smallest element in O(n) average time.

# Your code here
import random

def quickselect(arr, k):
    """
    Find kth smallest element (0-indexed).
    
    Average: O(n)
    Worst: O(n²)
    
    TODO: Implement using partition similar to quicksort
    """
    pass

# Test
# arr = [3, 1, 4, 1, 5, 9, 2, 6]
# print(f"Array: {arr}")
# print(f"3rd smallest: {quickselect(arr, 2)}")
# print(f"Sorted for verification: {sorted(arr)}")

Exercise 2: Longest Increasing Subsequence#

Find the length of the longest increasing subsequence using dynamic programming.

# Your code here

def longest_increasing_subsequence(arr):
    """
    Find length of LIS.
    
    Example: [10, 9, 2, 5, 3, 7, 101, 18] → 4 ([2, 3, 7, 101])
    
    Time: O(n²) with DP, O(n log n) with binary search
    
    TODO: Implement using DP
    """
    pass

# Test
# arr = [10, 9, 2, 5, 3, 7, 101, 18]
# print(f"Array: {arr}")
# print(f"LIS length: {longest_increasing_subsequence(arr)}")

Exercise 3: Detect Cycle in Graph#

Implement cycle detection in both directed and undirected graphs.

# Your code here

def has_cycle_undirected(graph):
    """
    Detect cycle in undirected graph using DFS.
    
    TODO: Implement
    """
    pass

def has_cycle_directed(graph):
    """
    Detect cycle in directed graph using DFS.
    
    Use three colors: white (unvisited), gray (visiting), black (done)
    
    TODO: Implement
    """
    pass

# Test
# Create test graphs and verify

Exercise 4: Implement A* Pathfinding#

Implement A* algorithm with Manhattan distance heuristic.

# Your code here

def a_star(grid, start, goal):
    """
    A* pathfinding algorithm.
    
    Args:
        grid: 2D array (0 = open, 1 = blocked)
        start: (row, col)
        goal: (row, col)
    
    Returns:
        Path as list of (row, col) tuples
    
    TODO: Implement using priority queue
    f(n) = g(n) + h(n)
    where g(n) = cost so far, h(n) = heuristic estimate to goal
    """
    pass

# Test
# grid = [
#     [0, 0, 0, 0, 0],
#     [0, 1, 1, 0, 0],
#     [0, 0, 0, 0, 0],
#     [0, 0, 1, 1, 0],
#     [0, 0, 0, 0, 0]
# ]
# start, goal = (0, 0), (4, 4)
# path = a_star(grid, start, goal)
# print(f"Path: {path}")

Pro Tips#

🎯 Algorithm Selection Guide#

Sorting:

  • Small data (< 50): Insertion sort

  • Need stability: Merge sort

  • Average case speed: Quick sort

  • Guaranteed O(n log n): Merge or Heap sort

  • In-place required: Quick or Heap sort

Searching:

  • Sorted data: Binary search (O(log n))

  • Unsorted small data: Linear search

  • Unsorted large data: Hash table or sort first
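
For sorted data in Python specifically, the standard-library bisect module gives you binary search without hand-rolling it:

import bisect

data = [1, 3, 3, 5, 8, 13]
i = bisect.bisect_left(data, 3)   # leftmost insertion point: index 1
print(f"leftmost 3 at index {i}, present: {i < len(data) and data[i] == 3}")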

Graph Algorithms:

  • Shortest path (unweighted): BFS

  • Shortest path (weighted, non-negative): Dijkstra’s

  • Shortest path (negative weights): Bellman-Ford

  • All-pairs shortest path: Floyd-Warshall

  • Detect cycle: DFS

  • Connected components: DFS or Union-Find

⚠️ Common Mistakes#

  1. Not considering space complexity: O(1) vs O(n) space matters

  2. Using O(n²) when O(n log n) exists: Sort before searching

  3. Forgetting base cases in recursion: Causes infinite recursion

  4. Not handling edge cases: Empty arrays, single elements

  5. Premature optimization: Write correct code first, optimize later

  6. Ignoring integer overflow: Python ints are arbitrary-precision, but fixed-width types (C, Java, NumPy) do overflow

  7. Not testing with large inputs: Performance issues only show at scale

  8. Confusing best/average/worst case: Know which applies

💡 Optimization Strategies#

  1. Hash tables for O(1) lookup: Trade space for time

  2. Two pointers: Avoid nested loops when possible

  3. Sliding window: For subarray/substring problems

  4. Memoization: Cache expensive recursive calls

  5. Sort first: Enables binary search and two pointers

  6. Use right data structure: Array, linked list, tree, graph?

  7. Amortized analysis: Average over sequence of operations

  8. Lazy evaluation: Compute only when needed
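
Two of these strategies — sliding window and its two-pointer cousin — turn a naive O(n·k) scan into O(n). A minimal sketch (max sum of any k consecutive elements):

def max_window_sum(arr, k):
    """Sliding window: max sum of any k consecutive elements in O(n)."""
    window = sum(arr[:k])               # first window
    best = window
    for i in range(k, len(arr)):
        window += arr[i] - arr[i - k]   # slide right: add new element, drop old
        best = max(best, window)
    return best

print(max_window_sum([2, 1, 5, 1, 3, 2], 3))  # 9 → window [5, 1, 3]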


Key Takeaways#

  1. Big O describes scalability, not absolute speed

  2. O(1) < O(log n) < O(n) < O(n log n) < O(n²) < O(2ⁿ) < O(n!)

  3. Drop constants and lower-order terms in Big O

  4. Sorting: O(n²) for simple, O(n log n) for efficient

  5. Searching: Binary search requires sorted data, gives O(log n)

  6. Recursion: Elegant but watch for exponential time and stack space

  7. Dynamic programming: Memoization (top-down) or tabulation (bottom-up)

  8. Graphs: DFS uses stack, BFS uses queue, both O(V + E)

  9. Space complexity matters for embedded systems and recursion limits

  10. Choose the right algorithm for your constraints (time, space, stability)


Next Steps#

  1. Practice on LeetCode/HackerRank: Apply concepts to real problems

  2. Study advanced topics: AVL trees, Red-Black trees, Tries

  3. Learn approximation algorithms: For NP-hard problems

  4. Explore parallel algorithms: Utilize multiple cores

  5. Read “Introduction to Algorithms”: CLRS textbook

  6. Implement from scratch: Best way to truly understand


Continue to Deep Learning and Neural Networks to apply these optimization concepts to ML!