Problem Statement and Metrics
Explore how to formulate the problem statement and design key metrics for personalized feed ranking, focusing on maximizing user engagement. Understand offline and online metrics such as Click Through Rate and Normalized Cross-Entropy. Discover requirements for training, personalization, data freshness, scalability, and latency in large-scale ranking systems.
LinkedIn feed ranking
1. Problem statement
Design a personalized LinkedIn feed to maximize long-term user engagement. One way to measure engagement is user frequency, i.e., the number of engagements per user, but this is very difficult to measure in practice. Another way is to measure the click probability, or Click Through Rate (CTR).
On the LinkedIn feed, there are five major activity types:
- Connections (A connects with B)
- Informational
- Profile
- Opinion
- Site-specific

Intuitively, different activities have very different CTR. This is important when we decide to build models and generate training data.
| Category | Example |
|---|---|
| Connection | Member connects with or follows member/company, member joins group |
| Informational | Member or company shares article/picture/message |
| Profile | Member updates profile, e.g., picture or job change |
| Opinion | Member likes or comments on articles, pictures, job-changes, etc. |
| Site-specific | Member endorses member, etc. |
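
To make the point about activity types concrete, here is a minimal sketch that computes CTR separately for each activity type. The impression log, its field layout, and the function name `ctr_by_activity` are all hypothetical and invented for illustration; a real system would read these counts from logged feed impressions.

```python
from collections import defaultdict

# Hypothetical impression log: (activity_type, clicked) pairs.
# The values are made up for illustration only.
impressions = [
    ("connection", True),
    ("connection", False),
    ("informational", True),
    ("informational", False),
    ("informational", False),
    ("informational", False),
    ("opinion", False),
    ("opinion", False),
]


def ctr_by_activity(log):
    """Return {activity_type: clicks / impressions} for each type in the log."""
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for activity_type, was_clicked in log:
        shown[activity_type] += 1
        if was_clicked:
            clicked[activity_type] += 1
    return {activity: clicked[activity] / shown[activity] for activity in shown}


rates = ctr_by_activity(impressions)
for activity, rate in sorted(rates.items()):
    print(f"{activity}: {rate:.2f}")
```

If the per-type rates differ widely, as they do in this toy log, pooling all activities into one training set can hide those differences. That is one reason to account for activity type when generating training data.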
2. Metrics design and requirements
Metrics
Offline metrics
- The Click Through Rate (CTR) for one specific feed is the number of clicks that feed receives, divided by the number of times the feed is shown.
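
Written as a formula, the CTR definition is shown below. It assumes that clicks and impressions are counted for the same feed over the same time window.

$$
\mathrm{CTR} = \frac{\text{number of clicks}}{\text{number of impressions}}
$$

For instance, with hypothetical numbers, a feed that is shown 1,000 times and receives 25 clicks has a CTR of 25 / 1,000 = 2.5%.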
...