Ad CTR Prediction: Data Strategy & Feature Engineering

Explore effective data strategies and feature engineering for ad click-through rate prediction systems. Understand how to handle high-cardinality user and ad features with embeddings and compression techniques. Learn the real-time versus pre-computed feature split for latency-sensitive serving and perform key storage and throughput estimations to design scalable, production-ready systems.

We'll cover the following...

The four feature families
Handling sparse high-cardinality features
- Embedding representations
- Compression techniques for production
Real-time vs. pre-computed feature split
- Pre-computed features
- Real-time features
Back-of-the-envelope estimates
- Storage estimates
- QPS and throughput
Preparing for model architecture

In a MAANG system design interview, proposing a transformer-based ranking model for ad CTR prediction earns you a nod. But the follow-up question, “How do you serve user features for a billion users under 20 milliseconds?” is where most candidates stumble. The model architecture is only as good as the features it consumes, and in CTR prediction, feature engineering is the design problem that separates L4 answers from Staff-level ones.

The previous lesson established the eCPM equation ($\text{eCPM} = \text{pCTR} \times \text{bid}$), defined business metrics, and locked in the sub-100ms end-to-end latency constraint. With that foundation set, the focus now shifts to the raw material powering pCTR: features. The CTR model must capture who the user is, what the ad is, when and where the impression occurs, and how these dimensions interact, all within a 10–20ms inference window. Four feature families organize this complexity: user, ad, context, and cross features. This lesson defines each family, tackles the high-cardinality categorical challenge with embeddings, designs the real-time vs. pre-computed feature split, and works through back-of-the-envelope estimates for storage and QPS.
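To make the role of pCTR concrete, here is a minimal Python sketch of ranking candidate ads by the eCPM equation recalled above. The `AdCandidate` class, the ad IDs, and the pCTR and bid values are all hypothetical, chosen only for illustration. The formula is used exactly as stated in this lesson, without any per-thousand scaling.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AdCandidate:
    """Hypothetical container for one candidate ad in an auction."""
    ad_id: str
    pctr: float  # model-predicted click-through rate, in [0, 1]
    bid: float   # advertiser bid, in arbitrary currency units


def ecpm(candidate: AdCandidate) -> float:
    """eCPM as defined in the lesson: pCTR multiplied by bid."""
    return candidate.pctr * candidate.bid


def rank_by_ecpm(candidates: List[AdCandidate]) -> List[AdCandidate]:
    """Return candidates sorted from highest to lowest eCPM."""
    return sorted(candidates, key=ecpm, reverse=True)


# Illustrative values only; not taken from any real system.
candidates = [
    AdCandidate("ad_a", pctr=0.02, bid=1.50),
    AdCandidate("ad_b", pctr=0.05, bid=0.40),
    AdCandidate("ad_c", pctr=0.01, bid=4.00),
]
ranked = rank_by_ecpm(candidates)
print([c.ad_id for c in ranked])
```

In this sketch, the ad with the highest pCTR (`ad_b`) ranks last because its bid is low, which is why a small pCTR error can reorder the auction and why the features feeding pCTR matter so much.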

Attention: A candidate who proposes a sophisticated model architecture but hand-waves feature design will fail at L5+ rounds. Interviewers probe how features are sourced, stored, and served under latency constraints, not just what the model looks like.

The four feature families

CTR prediction draws signal from four distinct families, each capturing a different dimension of the ad impression event. Think of it like a restaurant recommendation: you need to know the diner’s preferences (user), the dish being offered (ad), the time and setting of the meal (context), and whether this particular diner has enjoyed similar dishes before (cross features).

The following families form the organizing taxonomy for every CTR feature engineering discussion:

User features: These encode who is seeing the ad. Demographic attributes like age bucket, country, and device type provide static context. Behavioral signals such as historical CTR, category affinity scores, and session depth capture engagement patterns. Long-term signals like 30-day click-through rates on specific ad categories reveal stable preferences.
Ad features: These describe what is being shown. Creative metadata includes ad format, whether the creative is image or video, and text length. Advertiser category, historical ad-level CTR, bid amount, and campaign age round out the ad’s profile.
Context features: These capture when and where the ...
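The user and ad families described above can be sketched as simple typed records. This is a minimal illustration, not a production schema: the class names, field names, types, and the `flatten` helper are assumptions made for this example, and only the attributes named in the text are included. The context and cross families are omitted here.

```python
from dataclasses import dataclass, field, asdict
from typing import Any, Dict


@dataclass
class UserFeatures:
    """Hypothetical record for the user family."""
    age_bucket: str                 # demographic, static
    country: str                    # demographic, static
    device_type: str                # demographic, static
    historical_ctr: float           # behavioral
    session_depth: int              # behavioral
    category_affinity: Dict[str, float] = field(default_factory=dict)
    ctr_30d_by_category: Dict[str, float] = field(default_factory=dict)


@dataclass
class AdFeatures:
    """Hypothetical record for the ad family."""
    ad_format: str
    is_video: bool                  # image vs. video creative
    text_length: int
    advertiser_category: str
    historical_ctr: float           # ad-level CTR
    bid: float
    campaign_age_days: int


def flatten(prefix: str, record: Any) -> Dict[str, Any]:
    """Flatten a record into prefixed keys, expanding dict-valued fields."""
    out: Dict[str, Any] = {}
    for name, value in asdict(record).items():
        if isinstance(value, dict):
            for key, inner in value.items():
                out[f"{prefix}.{name}.{key}"] = inner
        else:
            out[f"{prefix}.{name}"] = value
    return out


# Illustrative values only.
user = UserFeatures("25-34", "US", "mobile", 0.031, 4,
                    category_affinity={"sports": 0.8},
                    ctr_30d_by_category={"sports": 0.045})
ad = AdFeatures("banner", True, 42, "sports", 0.027, 1.2, 14)
row = {**flatten("user", user), **flatten("ad", ad)}
```

Prefixing each key with its family keeps the two `historical_ctr` signals distinct and makes it easy to see which family a feature belongs to when deciding how it is stored and served.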
