Lesson 8: Classic Problems Collection#

This notebook contains legendary problems that every advanced programmer should know. These problems appear in:

  • Technical interviews (FAANG companies)

  • Competitive programming (Codeforces, LeetCode, TopCoder)

  • Computer science literature

  • Real-world applications

What You’ll Learn#

  • Classic algorithm problems and optimal solutions

  • LeetCode-style interview problems

  • Cryptography fundamentals

  • Machine learning from scratch

  • Problem-solving patterns and techniques

Prerequisites#

  • Strong Python fundamentals

  • Understanding of algorithms and data structures

  • Big O notation

  • Object-oriented programming


🎯 Problem-Solving Framework#

For each problem:

  1. Understand: Read carefully, identify inputs/outputs

  2. Examples: Work through examples by hand

  3. Brute Force: Think of the simplest solution first

  4. Optimize: Can you do better? Trade-offs?

  5. Code: Implement with clean, readable code

  6. Test: Edge cases, performance

  7. Analyze: Time and space complexity


Part 1: Classic LeetCode Problems#

These are the most famous interview problems. Mastering these patterns will help you solve hundreds of similar problems.

Problem 1: Two Sum (LeetCode #1)#

Difficulty: Easy
Pattern: Hash Map
Companies: Amazon, Google, Facebook, Apple

Problem: Given an array of integers nums and an integer target, return indices of the two numbers that add up to target.

Example:

Input: nums = [2, 7, 11, 15], target = 9
Output: [0, 1] (because nums[0] + nums[1] = 2 + 7 = 9)

Constraints:

  • Each input has exactly one solution

  • You cannot use the same element twice

def two_sum_brute(nums: list[int], target: int) -> list[int]:
    """
    Brute force: Check all pairs
    Time: O(n²), Space: O(1)
    """
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return [i, j]
    return []

def two_sum_optimal(nums: list[int], target: int) -> list[int]:
    """
    Hash map: Trade space for time
    Time: O(n), Space: O(n)
    """
    seen = {}  # value -> index
    for i, num in enumerate(nums):
        complement = target - num
        if complement in seen:
            return [seen[complement], i]
        seen[num] = i
    return []

# Test
test_cases = [
    ([2, 7, 11, 15], 9, [0, 1]),
    ([3, 2, 4], 6, [1, 2]),
    ([3, 3], 6, [0, 1]),
]

for nums, target, expected in test_cases:
    result = two_sum_optimal(nums, target)
    print(f"✓" if result == expected else "✗", f"two_sum({nums}, {target}) = {result}")

💡 Key Insight#

The hash map pattern: Instead of searching for complement in remaining array (O(n)), store seen values in hash map (O(1) lookup).

Problem 2: Three Sum (LeetCode #15)#

Difficulty: Medium
Pattern: Two Pointers
Companies: Amazon, Facebook, Microsoft

Problem: Given an array nums, find all unique triplets [nums[i], nums[j], nums[k]] such that i != j != k and nums[i] + nums[j] + nums[k] == 0.

Example:

Input: nums = [-1, 0, 1, 2, -1, -4]
Output: [[-1, -1, 2], [-1, 0, 1]]
def three_sum(nums: list[int]) -> list[list[int]]:
    """
    Sort + Two Pointers
    Time: O(n²), Space: O(1) (excluding output)
    """
    nums.sort()
    result = []
    
    for i in range(len(nums) - 2):
        # Skip duplicates for first number
        if i > 0 and nums[i] == nums[i - 1]:
            continue
        
        # Two pointers for remaining sum
        left, right = i + 1, len(nums) - 1
        target = -nums[i]
        
        while left < right:
            current_sum = nums[left] + nums[right]
            
            if current_sum == target:
                result.append([nums[i], nums[left], nums[right]])
                
                # Skip duplicates
                while left < right and nums[left] == nums[left + 1]:
                    left += 1
                while left < right and nums[right] == nums[right - 1]:
                    right -= 1
                
                left += 1
                right -= 1
            elif current_sum < target:
                left += 1
            else:
                right -= 1
    
    return result

# Test
print(three_sum([-1, 0, 1, 2, -1, -4]))  # [[-1, -1, 2], [-1, 0, 1]]
print(three_sum([0, 1, 1]))  # []
print(three_sum([0, 0, 0]))  # [[0, 0, 0]]

Problem 3: Longest Substring Without Repeating Characters (LeetCode #3)#

Difficulty: Medium
Pattern: Sliding Window
Companies: Amazon, Bloomberg, Adobe

Problem: Find the length of the longest substring without repeating characters.

Example:

Input: s = "abcabcbb"
Output: 3 ("abc")
def length_of_longest_substring(s: str) -> int:
    """
    Sliding Window with Hash Set
    Time: O(n), Space: O(min(m, n)) where m is charset size
    """
    char_set = set()
    left = 0
    max_length = 0
    
    for right in range(len(s)):
        # Shrink window until no duplicates
        while s[right] in char_set:
            char_set.remove(s[left])
            left += 1
        
        char_set.add(s[right])
        max_length = max(max_length, right - left + 1)
    
    return max_length

# Test
test_cases = [
    ("abcabcbb", 3),
    ("bbbbb", 1),
    ("pwwkew", 3),
    ("", 0),
]

for s, expected in test_cases:
    result = length_of_longest_substring(s)
    print(f"✓" if result == expected else "✗", f"length_of_longest_substring('{s}') = {result}")

Problem 4: Merge K Sorted Lists (LeetCode #23)#

Difficulty: Hard
Pattern: Heap / Priority Queue
Companies: Google, Facebook, Amazon

Problem: Merge k sorted linked lists into one sorted list.

Example:

Input: [[1,4,5], [1,3,4], [2,6]]
Output: [1,1,2,3,4,4,5,6]
import heapq
from typing import Optional

class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next
    
    def __lt__(self, other):
        return self.val < other.val

def merge_k_lists(lists: list[Optional[ListNode]]) -> Optional[ListNode]:
    """
    Min Heap approach
    Time: O(N log k) where N is total nodes, k is number of lists
    Space: O(k) for the heap
    """
    heap = []
    
    # Add first node from each list
    for i, node in enumerate(lists):
        if node:
            heapq.heappush(heap, (node.val, i, node))
    
    dummy = ListNode(0)
    current = dummy
    
    while heap:
        val, i, node = heapq.heappop(heap)
        current.next = node
        current = current.next
        
        if node.next:
            heapq.heappush(heap, (node.next.val, i, node.next))
    
    return dummy.next

# Helper functions
def create_list(arr):
    dummy = ListNode(0)
    current = dummy
    for val in arr:
        current.next = ListNode(val)
        current = current.next
    return dummy.next

def list_to_array(head):
    result = []
    while head:
        result.append(head.val)
        head = head.next
    return result

# Test
lists = [create_list([1,4,5]), create_list([1,3,4]), create_list([2,6])]
merged = merge_k_lists(lists)
print(f"Merged: {list_to_array(merged)}")

Problem 5: Trapping Rain Water (LeetCode #42)#

Difficulty: Hard
Pattern: Two Pointers
Companies: Google, Amazon, Bloomberg

Problem: Given heights representing an elevation map where width of each bar is 1, compute how much water can be trapped after raining.

Example:

Input: height = [0,1,0,2,1,0,1,3,2,1,2,1]
Output: 6
def trap(height: list[int]) -> int:
    """
    Two pointers approach
    Time: O(n), Space: O(1)
    
    Key insight: Water level at position i is determined by
    min(max_left, max_right) - height[i]
    """
    if not height:
        return 0
    
    left, right = 0, len(height) - 1
    left_max, right_max = height[left], height[right]
    water = 0
    
    while left < right:
        if height[left] < height[right]:
            left += 1
            left_max = max(left_max, height[left])
            water += left_max - height[left]
        else:
            right -= 1
            right_max = max(right_max, height[right])
            water += right_max - height[right]
    
    return water

# Test
test_cases = [
    ([0,1,0,2,1,0,1,3,2,1,2,1], 6),
    ([4,2,0,3,2,5], 9),
]

for height, expected in test_cases:
    result = trap(height)
    print(f"✓" if result == expected else "✗", f"trap({height}) = {result} (expected {expected})")

Part 2: Classic Dynamic Programming#

These are fundamental DP problems that form the basis of countless variations.

Problem 6: 0/1 Knapsack#

Difficulty: Hard
Pattern: Dynamic Programming
Classic: One of the most famous DP problems

Problem: Given weights and values of n items, put these items in a knapsack of capacity W to get the maximum total value.

Example:

values = [60, 100, 120]
weights = [10, 20, 30]
capacity = 50
Output: 220 (take items 2 and 3)
def knapsack(values: list[int], weights: list[int], capacity: int) -> int:
    """
    Classic 0/1 Knapsack with DP
    Time: O(n * W), Space: O(n * W)
    
    dp[i][w] = max value using first i items with capacity w
    """
    n = len(values)
    # Create DP table
    dp = [[0] * (capacity + 1) for _ in range(n + 1)]
    
    for i in range(1, n + 1):
        for w in range(capacity + 1):
            # Don't take item i-1
            dp[i][w] = dp[i-1][w]
            
            # Take item i-1 if possible
            if weights[i-1] <= w:
                dp[i][w] = max(dp[i][w], 
                              dp[i-1][w - weights[i-1]] + values[i-1])
    
    return dp[n][capacity]

def knapsack_optimized(values: list[int], weights: list[int], capacity: int) -> int:
    """
    Space-optimized: O(W) space
    """
    dp = [0] * (capacity + 1)
    
    for i in range(len(values)):
        # Traverse backwards to avoid using updated values
        for w in range(capacity, weights[i] - 1, -1):
            dp[w] = max(dp[w], dp[w - weights[i]] + values[i])
    
    return dp[capacity]

# Test
values = [60, 100, 120]
weights = [10, 20, 30]
capacity = 50
print(f"Knapsack: {knapsack(values, weights, capacity)}")
print(f"Knapsack (optimized): {knapsack_optimized(values, weights, capacity)}")

Problem 7: Longest Common Subsequence (LCS)#

Classic: Fundamental DP problem with many applications (diff tools, DNA sequencing)

Problem: Find the length of the longest subsequence common to both strings.

Example:

text1 = "abcde", text2 = "ace"
Output: 3 ("ace")
def longest_common_subsequence(text1: str, text2: str) -> int:
    """
    Classic LCS with DP
    Time: O(m * n), Space: O(m * n)
    """
    m, n = len(text1), len(text2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if text1[i-1] == text2[j-1]:
                dp[i][j] = dp[i-1][j-1] + 1
            else:
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])
    
    return dp[m][n]

def lcs_with_string(text1: str, text2: str) -> str:
    """
    Return the actual LCS string
    """
    m, n = len(text1), len(text2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    # Build DP table
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if text1[i-1] == text2[j-1]:
                dp[i][j] = dp[i-1][j-1] + 1
            else:
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])
    
    # Backtrack to find the string
    result = []
    i, j = m, n
    while i > 0 and j > 0:
        if text1[i-1] == text2[j-1]:
            result.append(text1[i-1])
            i -= 1
            j -= 1
        elif dp[i-1][j] > dp[i][j-1]:
            i -= 1
        else:
            j -= 1
    
    return ''.join(reversed(result))

# Test
print(f"LCS length: {longest_common_subsequence('abcde', 'ace')}")
print(f"LCS string: '{lcs_with_string('abcde', 'ace')}'")
print(f"LCS of 'abc' and 'abc': {longest_common_subsequence('abc', 'abc')}")
print(f"LCS of 'abc' and 'def': {longest_common_subsequence('abc', 'def')}")

Problem 8: Edit Distance (Levenshtein Distance)#

Classic: Used in spell checkers, DNA analysis, plagiarism detection

Problem: Given two strings, find minimum number of operations (insert, delete, replace) to convert one to the other.

Example:

word1 = "horse", word2 = "ros"
Output: 3 (horse -> rorse -> rose -> ros)
def min_distance(word1: str, word2: str) -> int:
    """
    Edit Distance using DP
    Time: O(m * n), Space: O(m * n)
    """
    m, n = len(word1), len(word2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    # Base cases
    for i in range(m + 1):
        dp[i][0] = i  # Delete all characters
    for j in range(n + 1):
        dp[0][j] = j  # Insert all characters
    
    # Fill DP table
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if word1[i-1] == word2[j-1]:
                dp[i][j] = dp[i-1][j-1]  # No operation needed
            else:
                dp[i][j] = 1 + min(
                    dp[i-1][j],    # Delete from word1
                    dp[i][j-1],    # Insert into word1
                    dp[i-1][j-1]   # Replace in word1
                )
    
    return dp[m][n]

# Test
test_cases = [
    ("horse", "ros", 3),
    ("intention", "execution", 5),
    ("", "abc", 3),
    ("abc", "abc", 0),
]

for word1, word2, expected in test_cases:
    result = min_distance(word1, word2)
    print(f"✓" if result == expected else "✗", 
          f"edit_distance('{word1}', '{word2}') = {result}")

Part 3: Classic Graph Algorithms#

Problem 9: Dijkstra’s Shortest Path#

Classic: Edsger Dijkstra, 1956
Applications: GPS navigation, network routing, airline routes

Problem: Find shortest path from source to all vertices in weighted graph.

import heapq
from collections import defaultdict
from typing import Dict, List, Tuple

def dijkstra(graph: Dict[int, List[Tuple[int, int]]], start: int) -> Dict[int, int]:
    """
    Dijkstra's algorithm using min-heap
    Time: O((V + E) log V), Space: O(V)
    
    Args:
        graph: Adjacency list {node: [(neighbor, weight), ...]}
        start: Starting node
    
    Returns:
        Dictionary {node: shortest_distance}
    """
    distances = {node: float('inf') for node in graph}
    distances[start] = 0
    
    # Min heap: (distance, node)
    pq = [(0, start)]
    visited = set()
    
    while pq:
        current_dist, current = heapq.heappop(pq)
        
        if current in visited:
            continue
        visited.add(current)
        
        # Check all neighbors
        for neighbor, weight in graph.get(current, []):
            distance = current_dist + weight
            
            if distance < distances[neighbor]:
                distances[neighbor] = distance
                heapq.heappush(pq, (distance, neighbor))
    
    return distances

# Test with example graph
graph = {
    0: [(1, 4), (2, 1)],
    1: [(3, 1)],
    2: [(1, 2), (3, 5)],
    3: [],
}

distances = dijkstra(graph, 0)
print("Shortest distances from node 0:")
for node, dist in sorted(distances.items()):
    print(f"  Node {node}: {dist}")

Problem 10: Detect Cycle in Directed Graph#

Classic: Fundamental graph problem
Applications: Deadlock detection, dependency resolution

Problem: Detect if a directed graph contains a cycle.

def has_cycle_dfs(graph: Dict[int, List[int]]) -> bool:
    """
    Detect cycle using DFS with coloring
    White (0) = unvisited, Gray (1) = in progress, Black (2) = done
    Time: O(V + E)
    """
    color = {node: 0 for node in graph}
    
    def dfs(node):
        if color[node] == 1:  # Gray = cycle found
            return True
        if color[node] == 2:  # Black = already processed
            return False
        
        color[node] = 1  # Mark as in progress
        
        for neighbor in graph.get(node, []):
            if dfs(neighbor):
                return True
        
        color[node] = 2  # Mark as done
        return False
    
    for node in graph:
        if color[node] == 0:
            if dfs(node):
                return True
    
    return False

# Test
graph_with_cycle = {
    0: [1],
    1: [2],
    2: [0],  # Cycle: 0 -> 1 -> 2 -> 0
}

graph_without_cycle = {
    0: [1, 2],
    1: [3],
    2: [3],
    3: [],
}

print(f"Graph 1 has cycle: {has_cycle_dfs(graph_with_cycle)}")
print(f"Graph 2 has cycle: {has_cycle_dfs(graph_without_cycle)}")

Part 4: Classic Cryptography#

Problem 11: Caesar Cipher#

Classic: Used by Julius Caesar (58-51 BC)
Type: Substitution cipher

Problem: Shift each letter by a fixed number of positions in the alphabet.

def caesar_encrypt(text: str, shift: int) -> str:
    """
    Caesar cipher encryption
    Time: O(n), Space: O(n)
    """
    result = []
    
    for char in text:
        if char.isalpha():
            # Determine if uppercase or lowercase
            start = ord('A') if char.isupper() else ord('a')
            # Shift and wrap around
            shifted = (ord(char) - start + shift) % 26
            result.append(chr(start + shifted))
        else:
            result.append(char)
    
    return ''.join(result)

def caesar_decrypt(text: str, shift: int) -> str:
    """Decrypt by shifting backwards"""
    return caesar_encrypt(text, -shift)

def caesar_crack(ciphertext: str) -> list[tuple[int, str]]:
    """
    Brute force attack: try all 26 possible shifts
    """
    results = []
    for shift in range(26):
        plaintext = caesar_decrypt(ciphertext, shift)
        results.append((shift, plaintext))
    return results

# Test
plaintext = "HELLO WORLD"
shift = 3
encrypted = caesar_encrypt(plaintext, shift)
decrypted = caesar_decrypt(encrypted, shift)

print(f"Plaintext:  {plaintext}")
print(f"Encrypted:  {encrypted}")
print(f"Decrypted:  {decrypted}")
print(f"\nBrute force attack (first 5 attempts):")
for shift, text in caesar_crack(encrypted)[:5]:
    print(f"  Shift {shift:2d}: {text}")

Problem 12: Vigenère Cipher#

Classic: 16th century, considered unbreakable for 300 years
Type: Polyalphabetic substitution

Problem: Use a keyword to shift letters by different amounts.

def vigenere_encrypt(plaintext: str, key: str) -> str:
    """
    Vigenère cipher encryption
    Time: O(n), Space: O(n)
    """
    result = []
    key = key.upper()
    key_index = 0
    
    for char in plaintext:
        if char.isalpha():
            # Get shift amount from key
            shift = ord(key[key_index % len(key)]) - ord('A')
            
            # Apply shift
            start = ord('A') if char.isupper() else ord('a')
            shifted = (ord(char) - start + shift) % 26
            result.append(chr(start + shifted))
            
            key_index += 1
        else:
            result.append(char)
    
    return ''.join(result)

def vigenere_decrypt(ciphertext: str, key: str) -> str:
    """
    Vigenère cipher decryption
    """
    result = []
    key = key.upper()
    key_index = 0
    
    for char in ciphertext:
        if char.isalpha():
            # Get shift amount from key
            shift = ord(key[key_index % len(key)]) - ord('A')
            
            # Apply reverse shift
            start = ord('A') if char.isupper() else ord('a')
            shifted = (ord(char) - start - shift) % 26
            result.append(chr(start + shifted))
            
            key_index += 1
        else:
            result.append(char)
    
    return ''.join(result)

# Test
plaintext = "ATTACK AT DAWN"
key = "LEMON"
encrypted = vigenere_encrypt(plaintext, key)
decrypted = vigenere_decrypt(encrypted, key)

print(f"Plaintext:  {plaintext}")
print(f"Key:        {key}")
print(f"Encrypted:  {encrypted}")
print(f"Decrypted:  {decrypted}")

Problem 13: RSA Encryption (Simplified)#

Classic: Rivest-Shamir-Adleman, 1977
Type: Asymmetric (public-key) cryptography
Foundation: Modern internet security (HTTPS, SSH, etc.)

Concept: Based on the difficulty of factoring large prime numbers.

Note: This is a simplified educational version. Real RSA uses much larger primes!

import random

def gcd(a: int, b: int) -> int:
    """Greatest Common Divisor using Euclidean algorithm"""
    while b:
        a, b = b, a % b
    return a

def mod_inverse(e: int, phi: int) -> int:
    """
    Find modular multiplicative inverse using Extended Euclidean Algorithm
    Returns d such that (e * d) % phi = 1
    """
    def extended_gcd(a, b):
        if a == 0:
            return b, 0, 1
        gcd_val, x1, y1 = extended_gcd(b % a, a)
        x = y1 - (b // a) * x1
        y = x1
        return gcd_val, x, y
    
    _, x, _ = extended_gcd(e, phi)
    return (x % phi + phi) % phi

def generate_rsa_keys(p: int, q: int) -> tuple[tuple[int, int], tuple[int, int]]:
    """
    Generate RSA key pair from two prime numbers
    
    Returns:
        public_key (e, n), private_key (d, n)
    """
    # Step 1: Calculate n
    n = p * q
    
    # Step 2: Calculate phi (Euler's totient)
    phi = (p - 1) * (q - 1)
    
    # Step 3: Choose e (public exponent)
    # Commonly 65537, but we'll choose a small coprime for demonstration
    e = 65537 if 65537 < phi else 3
    while gcd(e, phi) != 1:
        e += 2
    
    # Step 4: Calculate d (private exponent)
    d = mod_inverse(e, phi)
    
    return (e, n), (d, n)

def rsa_encrypt(message: int, public_key: tuple[int, int]) -> int:
    """
    Encrypt a number using RSA
    C = M^e mod n
    """
    e, n = public_key
    return pow(message, e, n)

def rsa_decrypt(ciphertext: int, private_key: tuple[int, int]) -> int:
    """
    Decrypt using RSA
    M = C^d mod n
    """
    d, n = private_key
    return pow(ciphertext, d, n)

# Test with small primes (educational purposes only!)
p, q = 61, 53  # Two prime numbers
public_key, private_key = generate_rsa_keys(p, q)

print(f"Prime p: {p}")
print(f"Prime q: {q}")
print(f"n (p*q): {p*q}")
print(f"Public key:  {public_key}")
print(f"Private key: {private_key}")
print()

# Encrypt and decrypt a message
message = 42
encrypted = rsa_encrypt(message, public_key)
decrypted = rsa_decrypt(encrypted, private_key)

print(f"Original message: {message}")
print(f"Encrypted:        {encrypted}")
print(f"Decrypted:        {decrypted}")
print(f"\n✓ RSA works!" if message == decrypted else "✗ RSA failed")

🔐 RSA Key Insights#

  1. Public Key Security: Anyone can encrypt with public key (e, n), but only private key (d, n) can decrypt

  2. Mathematical Foundation: Security relies on difficulty of factoring n = p * q when p and q are large primes

  3. Real-world: Modern RSA uses 2048-4096 bit keys (617-1234 digit numbers!)

  4. Applications: HTTPS, SSH, email encryption, digital signatures


Part 5: Classic Machine Learning from Scratch#

Implement fundamental ML algorithms without libraries to understand how they really work.

Problem 14: K-Nearest Neighbors from Scratch#

Classic: One of the simplest ML algorithms
Type: Instance-based learning

Problem: Classify a point based on the majority class of its k nearest neighbors.

import numpy as np
from collections import Counter

class KNN:
    """
    K-Nearest Neighbors from scratch
    """
    def __init__(self, k: int = 3):
        self.k = k
        self.X_train = None
        self.y_train = None
    
    def fit(self, X: np.ndarray, y: np.ndarray):
        """Store training data (lazy learning)"""
        self.X_train = X
        self.y_train = y
    
    def _euclidean_distance(self, x1: np.ndarray, x2: np.ndarray) -> float:
        """Calculate Euclidean distance between two points"""
        return np.sqrt(np.sum((x1 - x2) ** 2))
    
    def predict(self, X: np.ndarray) -> np.ndarray:
        """Predict class labels for X"""
        predictions = [self._predict_single(x) for x in X]
        return np.array(predictions)
    
    def _predict_single(self, x: np.ndarray) -> int:
        """Predict class for a single sample"""
        # Calculate distances to all training samples
        distances = [self._euclidean_distance(x, x_train) 
                    for x_train in self.X_train]
        
        # Get k nearest neighbors
        k_indices = np.argsort(distances)[:self.k]
        k_nearest_labels = [self.y_train[i] for i in k_indices]
        
        # Return most common class
        most_common = Counter(k_nearest_labels).most_common(1)
        return most_common[0][0]

# Test on classic Iris dataset
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load data
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

# Train and test our KNN
knn = KNN(k=3)
knn.fit(X_train, y_train)
predictions = knn.predict(X_test)

accuracy = accuracy_score(y_test, predictions)
print(f"KNN from scratch - Accuracy: {accuracy:.2%}")
print(f"\nFirst 10 predictions:")
print(f"Predicted: {predictions[:10]}")
print(f"Actual:    {y_test[:10]}")

Problem 15: Linear Regression from Scratch#

Classic: Carl Friedrich Gauss, 1794
Foundation: Basis for many ML algorithms

Problem: Find the best-fit line: y = mx + b

class LinearRegression:
    """
    Linear Regression using Gradient Descent
    """
    def __init__(self, learning_rate: float = 0.01, n_iterations: int = 1000):
        self.lr = learning_rate
        self.n_iterations = n_iterations
        self.weights = None
        self.bias = None
        self.losses = []
    
    def fit(self, X: np.ndarray, y: np.ndarray):
        """
        Train using gradient descent
        
        Gradient descent update rules:
        w = w - α * (∂L/∂w)
        b = b - α * (∂L/∂b)
        """
        n_samples, n_features = X.shape
        
        # Initialize parameters
        self.weights = np.zeros(n_features)
        self.bias = 0
        
        # Gradient descent
        for _ in range(self.n_iterations):
            # Forward pass
            y_predicted = np.dot(X, self.weights) + self.bias
            
            # Compute loss (MSE)
            loss = np.mean((y - y_predicted) ** 2)
            self.losses.append(loss)
            
            # Compute gradients
            dw = (2 / n_samples) * np.dot(X.T, (y_predicted - y))
            db = (2 / n_samples) * np.sum(y_predicted - y)
            
            # Update parameters
            self.weights -= self.lr * dw
            self.bias -= self.lr * db
    
    def predict(self, X: np.ndarray) -> np.ndarray:
        """Make predictions"""
        return np.dot(X, self.weights) + self.bias

# Test with synthetic data
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X.squeeze() + np.random.randn(100)  # y = 4 + 3x + noise

# Split data
split = int(0.8 * len(X))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Train
model = LinearRegression(learning_rate=0.1, n_iterations=1000)
model.fit(X_train, y_train)

# Evaluate
y_pred = model.predict(X_test)
mse = np.mean((y_test - y_pred) ** 2)
r2 = 1 - (np.sum((y_test - y_pred) ** 2) / np.sum((y_test - np.mean(y_test)) ** 2))

print(f"Linear Regression from scratch:")
print(f"True equation: y = 4 + 3x")
print(f"Learned equation: y = {model.bias:.2f} + {model.weights[0]:.2f}x")
print(f"MSE: {mse:.4f}")
print(f"R²: {r2:.4f}")
print(f"Final loss: {model.losses[-1]:.4f}")

Problem 16: Neural Network from Scratch#

Classic: Backpropagation algorithm (Rumelhart et al., 1986)
Foundation: Deep learning

Problem: Implement a simple feedforward neural network with backpropagation.

class NeuralNetwork:
    """
    Simple feedforward neural network with one hidden layer
    """
    def __init__(self, input_size: int, hidden_size: int, output_size: int, 
                 learning_rate: float = 0.1):
        self.lr = learning_rate
        
        # Initialize weights with Xavier initialization
        self.W1 = np.random.randn(input_size, hidden_size) * np.sqrt(2.0 / input_size)
        self.b1 = np.zeros((1, hidden_size))
        self.W2 = np.random.randn(hidden_size, output_size) * np.sqrt(2.0 / hidden_size)
        self.b2 = np.zeros((1, output_size))
        
        self.losses = []
    
    def sigmoid(self, x: np.ndarray) -> np.ndarray:
        """Sigmoid activation function"""
        return 1 / (1 + np.exp(-np.clip(x, -500, 500)))
    
    def sigmoid_derivative(self, x: np.ndarray) -> np.ndarray:
        """Derivative of sigmoid"""
        return x * (1 - x)
    
    def forward(self, X: np.ndarray) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
        """
        Forward propagation
        Returns: hidden_output, final_output, input
        """
        # Hidden layer
        z1 = np.dot(X, self.W1) + self.b1
        a1 = self.sigmoid(z1)
        
        # Output layer
        z2 = np.dot(a1, self.W2) + self.b2
        a2 = self.sigmoid(z2)
        
        return a1, a2, X
    
    def backward(self, X: np.ndarray, y: np.ndarray, a1: np.ndarray, a2: np.ndarray):
        """
        Backpropagation
        """
        m = X.shape[0]
        
        # Output layer gradients
        dz2 = a2 - y
        dW2 = (1 / m) * np.dot(a1.T, dz2)
        db2 = (1 / m) * np.sum(dz2, axis=0, keepdims=True)
        
        # Hidden layer gradients
        dz1 = np.dot(dz2, self.W2.T) * self.sigmoid_derivative(a1)
        dW1 = (1 / m) * np.dot(X.T, dz1)
        db1 = (1 / m) * np.sum(dz1, axis=0, keepdims=True)
        
        # Update weights
        self.W2 -= self.lr * dW2
        self.b2 -= self.lr * db2
        self.W1 -= self.lr * dW1
        self.b1 -= self.lr * db1
    
    def train(self, X: np.ndarray, y: np.ndarray, epochs: int = 1000):
        """
        Train the network
        """
        for epoch in range(epochs):
            # Forward pass
            a1, a2, _ = self.forward(X)
            
            # Compute loss (MSE)
            loss = np.mean((y - a2) ** 2)
            self.losses.append(loss)
            
            # Backward pass
            self.backward(X, y, a1, a2)
            
            if epoch % 100 == 0:
                print(f"Epoch {epoch}, Loss: {loss:.6f}")
    
    def predict(self, X: np.ndarray) -> np.ndarray:
        """Make predictions"""
        _, a2, _ = self.forward(X)
        return a2

# Test on XOR problem (classic non-linearly separable problem)
print("Training on XOR problem:")
print("="*50)

# XOR dataset
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [0]])

# Create and train network
nn = NeuralNetwork(input_size=2, hidden_size=4, output_size=1, learning_rate=0.5)
nn.train(X, y, epochs=5000)

# Test
print("\nPredictions:")
predictions = nn.predict(X)
for i in range(len(X)):
    print(f"Input: {X[i]} -> Predicted: {predictions[i][0]:.4f}, Actual: {y[i][0]}")

📝 Exercise Challenges#

Challenge 1: Four Sum (LeetCode #18)#

Extend Three Sum to find all unique quadruplets that sum to target.

Challenge 2: A* Search Algorithm#

Implement A* pathfinding (Dijkstra with heuristic). Classic in game AI and robotics.

Challenge 3: Implement SHA-256#

Implement the SHA-256 hash function (foundation of Bitcoin and blockchain).

Challenge 4: Decision Tree from Scratch#

Implement a decision tree classifier with entropy/gini impurity.

Challenge 5: Topological Sort#

Implement Kahn’s algorithm for topological sorting (dependency resolution).


🎯 Problem-Solving Patterns Learned#

  1. Hash Map Pattern: Trade space for time (Two Sum)

  2. Two Pointers: Reduce O(n²) to O(n) (Three Sum, Trapping Rain Water)

  3. Sliding Window: Efficient substring problems

  4. Heap/Priority Queue: K-way merge, top-K problems

  5. Dynamic Programming: Break into subproblems (Knapsack, LCS, Edit Distance)

  6. Graph Algorithms: BFS, DFS, Dijkstra, cycle detection

  7. Gradient Descent: Foundation of ML optimization

  8. Backpropagation: Training neural networks


📚 Further Practice#

Online Judges#

  • LeetCode: 2000+ problems with company tags

  • Codeforces: Competitive programming

  • HackerRank: Skills certification

  • Project Euler: Math + programming

Books#

  • Cracking the Coding Interview by Gayle Laakmann McDowell

  • Elements of Programming Interviews by Aziz, Lee, Prakash

  • Introduction to Algorithms (CLRS)

Cryptography#

  • The Code Book by Simon Singh (history)

  • Cryptography Engineering by Ferguson, Schneier, Kohno

  • CryptoHack.org (interactive challenges)


🏆 Mastery Checklist#

You’ve mastered this lesson when you can:

  • Solve LeetCode mediums in 20-30 minutes

  • Recognize patterns instantly (“This is a two-pointer problem”)

  • Explain time/space complexity clearly

  • Implement classic algorithms from memory

  • Debug with test cases systematically

  • Understand cryptography fundamentals

  • Implement ML algorithms without libraries

  • Optimize brute force to optimal solutions

Remember: These problems took decades to solve originally. Don’t get discouraged - everyone struggles at first. The key is consistent practice! 🚀