Ad CTR Prediction: Data Strategy & Feature Engineering

Explore effective data strategies and feature engineering for ad click-through rate prediction systems. Understand how to handle high-cardinality user and ad features with embeddings and compression techniques. Learn the real-time versus pre-computed feature split for latency-sensitive serving and perform key storage and throughput estimations to design scalable, production-ready systems.

In a MAANG system design interview, proposing a transformer-based ranking model for ad CTR prediction earns you a nod. But the follow-up question, “How do you serve user features for a billion users in under 20 milliseconds?”, is where most candidates stumble. The model architecture is only as good as the features it consumes, and in CTR prediction, feature engineering is the design problem that separates L4 answers from Staff-level ones.

The previous lesson established the eCPM equation (eCPM = pCTR × bid), defined business metrics, and locked in the sub-100ms end-to-end latency constraint. With that foundation set, the focus now shifts to the raw material powering pCTR: features. The CTR model must capture who the user is, what the ad is, when and where the impression occurs, and how these dimensions interact, all within a 10–20ms inference window. Four feature families organize this complexity: user, ad, context, and cross features. This lesson defines each family, tackles the high-cardinality categorical challenge with embeddings, designs the real-time vs. pre-computed feature split, and works through back-of-the-envelope estimates for storage and QPS.
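To make the role of pCTR concrete, here is a minimal sketch of ranking candidate ads by the eCPM equation recalled above. The `AdCandidate` type, the ad IDs, and the pCTR and bid values are all hypothetical and chosen only for illustration. The sketch follows the lesson's form of the equation as written; production systems may apply additional scaling or quality terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdCandidate:
    """Hypothetical candidate ad; field names are illustrative assumptions."""
    ad_id: str
    pctr: float  # predicted click-through rate, a probability in [0, 1]
    bid: float   # advertiser bid (assumed here to be a per-click bid in dollars)


def ecpm(candidate: AdCandidate) -> float:
    # eCPM = pCTR x bid, following the lesson's form of the equation.
    return candidate.pctr * candidate.bid


def rank_by_ecpm(candidates: list[AdCandidate]) -> list[AdCandidate]:
    # Highest expected value first.
    return sorted(candidates, key=ecpm, reverse=True)


# Made-up values for illustration only.
candidates = [
    AdCandidate("ad_a", pctr=0.02, bid=1.50),
    AdCandidate("ad_b", pctr=0.05, bid=0.40),
    AdCandidate("ad_c", pctr=0.01, bid=4.00),
]
ranked = rank_by_ecpm(candidates)
print([c.ad_id for c in ranked])
```

In this toy example, the ad with the highest pCTR (`ad_b`) ranks last because its bid is low, and the ad with the lowest pCTR (`ad_c`) ranks first. An error in pCTR therefore changes the ranking directly, which is why the quality of the features feeding the model matters so much.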

Attention: A candidate who proposes a sophisticated model architecture but hand-waves feature design will fail at L5+ rounds. Interviewers probe how features are sourced, stored, and served under latency constraints, not just what the model looks like.

The four feature families

CTR prediction draws signal from four distinct families, each capturing a different dimension of the ad impression event. Think of it like a restaurant recommendation: you need to know the diner’s preferences (user), the dish being offered (ad), the time and setting of the meal (context), and whether this particular diner has enjoyed similar dishes before (cross features).

The following families form the organizing taxonomy for every CTR feature engineering discussion:

  • User features: These encode who is seeing the ad. Demographic attributes like age bucket, country, and device type provide static context. Behavioral signals such as historical CTR, category affinity scores, and session depth capture engagement patterns. Long-term signals like 30-day click-through rates on specific ad categories reveal stable preferences.

  • Ad features: These describe what is being shown. Creative metadata includes ad format, whether the creative is image or video, and text length. Advertiser category, historical ad-level CTR, bid amount, and campaign age round out the ad’s profile.

  • Context features: These capture when and where the ...