Ad CTR Prediction: Problem Framing & Requirements
Explore how to effectively frame the ad click-through rate prediction problem by anchoring your design in auction principles, business success metrics, and strict latency requirements. This lesson guides you through balancing accuracy, calibration, and engineering constraints to build scalable, production-ready ML systems for ad ranking.
Every time a user scrolls through a feed on Meta, searches on Google, or swipes on TikTok, an ad auction fires in single-digit milliseconds. The ML model powering that auction predicts the probability that the user will click a given ad. Even a fraction-of-a-percent improvement in that prediction can shift billions of dollars in annual revenue, influence whether advertisers stay or leave the platform, and shape the quality of every user’s experience. This is why ad click-through rate (CTR) prediction is one of the most frequently asked ML system design problems at MAANG companies.
The problem touches every pillar of system design simultaneously. Data pipelines must ingest billions of impression logs daily. Feature engineering must handle sparse, high-cardinality categorical data. Model architectures like Wide & Deep and DeepFM must balance memorization with generalization. Serving infrastructure must return predictions under brutal latency constraints. And continuous training pipelines must keep the model fresh as user behavior shifts hour by hour.
This lesson focuses on the critical first step that separates strong interview candidates from weak ones: problem framing and requirements. Before proposing any model or architecture, you need to anchor your design in three things: the auction math, the business metrics, and the latency constraint. Let’s build that foundation.
The eCPM equation and auction mechanics
The entire ad ranking system rests on a single equation:
$$\text{eCPM} = \text{bid} \times \text{pCTR} \times 1000$$
The platform receives an ad request and retrieves a set of candidate ads. Each candidate has a bid set by the advertiser and a predicted click-through rate (pCTR) produced by the model.
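To make the ranking step concrete, here is a minimal sketch of scoring candidates by eCPM. It assumes the standard definition for cost-per-click bids, eCPM = bid × pCTR × 1000 (expected revenue per thousand impressions). The `Candidate` class, the function names, and all bid and pCTR values are hypothetical and chosen only for illustration; a production auction would also apply pricing rules, budgets, and quality adjustments that are not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate ad; field names are assumptions for this sketch."""
    ad_id: str
    bid: float   # assumed cost-per-click bid, in dollars
    pctr: float  # model's predicted click probability, in [0, 1]


def ecpm(candidate: Candidate) -> float:
    """Expected revenue per 1,000 impressions: bid * pCTR * 1000."""
    return candidate.bid * candidate.pctr * 1000


def rank_by_ecpm(candidates: list[Candidate]) -> list[Candidate]:
    """Return candidates sorted from highest to lowest eCPM."""
    return sorted(candidates, key=ecpm, reverse=True)


# Illustrative values only, not real auction data.
candidates = [
    Candidate("ad_a", bid=2.00, pctr=0.010),  # eCPM of about 20
    Candidate("ad_b", bid=0.50, pctr=0.050),  # eCPM of about 25
    Candidate("ad_c", bid=1.00, pctr=0.015),  # eCPM of about 15
]

ranked = rank_by_ecpm(candidates)
print([c.ad_id for c in ranked])  # ['ad_b', 'ad_a', 'ad_c']
```

In this toy example, `ad_b` wins despite having the lowest bid, because its higher predicted click probability more than compensates. The ranking is only as good as the pCTR values fed into it, which is why the quality of the prediction matters so much to the auction.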
A miscalibrated