Classical ML in System Design

Explore how to select and justify classical machine learning models such as logistic regression and gradient boosted decision trees for scalable, production-ready systems. Understand their strengths in latency, interpretability, and data efficiency, and learn to apply these models appropriately in system design interviews and real-world applications.

We'll cover the following:

- Logistic regression at scale for CTR
  - Serving characteristics that matter
  - Online learning with FTRL
- Gradient boosted trees for tabular ranking
  - Architectural strengths
  - XGBoost vs. LightGBM
- When classical ML outperforms neural approaches
- Bridging classical ML to neural architectures

When an interviewer asks you to design an ad click-through rate (CTR) prediction system serving 100K QPS under a 10 ms latency SLA, a transformer is usually not the right default choice. A stronger starting point is often logistic regression or a gradient boosted decision tree. In this setting that is not a downgrade: when latency, throughput, and cost dominate, it can be the right design choice. A strong candidate defends it with quantified trade-offs across latency, throughput, model quality, and operational cost.

The previous lesson introduced the baseline-first principle and the Pareto front for model selection. This lesson goes deeper into the classical models themselves, treating logistic regression and gradient boosted trees as production-grade systems rather than stepping stones to deep learning. These two model families power billions of daily predictions at Meta, Google, Uber, and Airbnb. By the end of this lesson, you will have a repeatable framework for defending classical ML choices in interviews with specific, quantified reasoning, and you will understand exactly when these models outperform neural approaches on the axes that matter.

Logistic regression at scale for CTR

Logistic regression remains the backbone of CTR prediction at companies like Google, whose seminal 2013 ad click prediction paper demonstrated that a well-engineered logistic regression model could match or exceed more complex alternatives in production. The reasons are architectural, not nostalgic.

Serving characteristics that matter

A logistic regression model computes a prediction as a single dot product followed by a sigmoid activation, $\hat{y} = \sigma(\mathbf{w}^T \mathbf{x} + b)$ ...
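To make the formula concrete, here is a minimal Python sketch of a single prediction. It assumes sparse binary (one-hot) features, which the text above does not state, so that $\mathbf{w}^T \mathbf{x}$ reduces to summing the weights of the features that are active. The names `predict_ctr`, `weights`, and the feature strings are hypothetical and chosen only for illustration.

```python
import math
from typing import Mapping, Sequence


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_ctr(
    weights: Mapping[str, float],
    bias: float,
    active_features: Sequence[str],
) -> float:
    """Compute sigma(w^T x + b) for a sparse binary feature vector.

    Assumption: every feature is one-hot, so x_i is 1 for the listed
    features and 0 elsewhere. The dot product is then a sum of lookups.
    Features missing from `weights` contribute 0.
    """
    z = bias + sum(weights.get(f, 0.0) for f in active_features)
    return sigmoid(z)


# Hypothetical weights; a production model would hold far more entries.
weights = {"ad_id=42": 0.8, "hour=20": -0.3, "device=mobile": 0.1}
p = predict_ctr(weights, bias=-2.0, active_features=["ad_id=42", "hour=20", "device=mobile"])
print(f"predicted CTR: {p:.4f}")
```

Under this assumption, the work per request grows with the number of active features rather than with the size of the full feature space.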
