Metric Guardrails and Cannibalization

Explore how metric cannibalization occurs when optimizing one metric negatively affects another critical metric in ML systems. Learn to identify common patterns, design guardrail metrics with thresholds, and apply strategies like constrained optimization to balance competing objectives in production environments.

We'll cover the following...

Why metric cannibalization matters
Classic cannibalization patterns
Designing guardrail metrics
- The metric hierarchy
- The three-step guardrail design process
Handling metric conflicts in production
- Resolution strategies at the MAANG scale
Conclusion

A recommendation model is deployed to production, CTR increases by 3%, and the launch initially looks successful. A week later, the retention dashboard shows a 0.8% drop in Day-7 return rate. Users are clicking more often but returning less often. The model optimized the target it was given, but the loss function did not capture the downstream retention impact. This is an objective misalignment problem, and recognizing it is an important signal in senior ML system design interviews.

Why metric cannibalization matters

The previous lesson established that ML metrics can betray business goals when the proxy relationship between what you optimize and what you actually care about breaks down. This lesson addresses the specific mechanism behind that breakdown and the production-grade defense against it.

Metric cannibalization is the phenomenon where improving the primary optimization metric causes a statistically significant degradation in another important business or user-experience metric. It is not a bug in the model. It is a predictable consequence of single-objective optimization in a multi-dimensional value space.
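The "statistically significant degradation" in the definition above can be checked with a standard two-proportion z-test on the secondary metric. The following is a minimal sketch, not a prescribed method: the function names, the user counts, and the 0.05 significance level are all hypothetical choices for illustration, and the test assumes a simple A/B split with independent users.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """One-sided z-test of H1: rate_b < rate_a (treatment is worse than control).

    Returns (z, p_value). A pooled standard error is used under H0: rate_a == rate_b.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF evaluated at z gives the lower-tail p-value.
    p_value = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return z, p_value


def is_cannibalized(primary_lift, secondary_p_value, alpha=0.05):
    """Flag cannibalization: the primary metric improved while the
    secondary metric dropped significantly (alpha is an assumed level)."""
    return primary_lift > 0 and secondary_p_value < alpha


# Hypothetical counts: 200,000 users per arm, Day-7 retention of
# 40.0% in control versus 39.2% in treatment.
z, p = two_proportion_z_test(80_000, 200_000, 78_400, 200_000)
print(f"z = {z:.2f}, cannibalized = {is_cannibalized(0.03, p)}")
```

With samples this large, even a sub-percentage-point drop is easily distinguishable from noise, which is why small secondary-metric regressions should not be dismissed as measurement error.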

In ML system design interviews at L5 and above, interviewers expect candidates to surface cannibalization risks without being prompted. Failing to do so signals a gap in production awareness. At MAANG scale, even a 0.1% degradation in a secondary metric can affect tens of millions of users and translate into significant revenue loss or erosion of user trust.

Note: Cannibalization is not hypothetical. Public postmortems from Meta, YouTube, and major e-commerce platforms have documented cases where primary metric gains masked serious user-experience harm.

The industry-standard defense is straightforward in concept but nuanced in execution. Guardrail metrics act as hard constraints on the optimization process, ensuring the system does not cross predefined harm thresholds while pursuing primary metric gains. The rest of this lesson breaks down how cannibalization manifests, how guardrails are designed, and how production teams resolve the conflicts that inevitably arise.
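To make the idea of a guardrail as a hard constraint concrete, here is a small sketch of a launch gate. It is an illustration under stated assumptions rather than a reference implementation: the `Guardrail` class, the metric names, and the 0.5% threshold are hypothetical, every guardrail metric is assumed to be "higher is better", and the comparison uses point estimates where a production system would typically compare confidence bounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guardrail:
    name: str                 # key of the metric in the experiment results
    max_relative_drop: float  # e.g. 0.005 tolerates at most a 0.5% relative drop


def breached_guardrails(guardrails, control, treatment):
    """Return the names of guardrail metrics whose relative drop
    from control to treatment exceeds the allowed threshold."""
    breached = []
    for g in guardrails:
        relative_change = (treatment[g.name] - control[g.name]) / control[g.name]
        if relative_change < -g.max_relative_drop:
            breached.append(g.name)
    return breached


def launch_decision(primary_lift, breached):
    """A breached guardrail blocks the launch regardless of the primary gain."""
    if breached:
        return "block"
    return "ship" if primary_lift > 0 else "hold"


# Hypothetical experiment readout: CTR is the primary metric,
# Day-7 retention is the guardrail.
guardrails = [Guardrail("d7_retention", max_relative_drop=0.005)]
control = {"d7_retention": 0.4000}
treatment = {"d7_retention": 0.3968}  # a 0.8% relative drop

breached = breached_guardrails(guardrails, control, treatment)
print(launch_decision(primary_lift=0.03, breached=breached), breached)
```

The key design choice is that the guardrail is evaluated independently of the primary metric: no amount of primary lift can buy back a breached threshold, which is what distinguishes a hard constraint from a weighted trade-off.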

Classic cannibalization patterns

Three canonical patterns recur across ML system design interviews and real production systems. Recognizing them quickly demonstrates design maturity.
