Feed Ranking System Design
Learn about the Feed Ranking system design for the LinkedIn application.
4. Calculation & estimation
Assumptions
- 300 million monthly active users
- On average, a user sees 40 activities per visit. Each user visits 10 times per month.
- We have 300 million * 40 * 10 = 12 * 10^10, or 120 billion, observations/samples per month.
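As a quick sanity check, the sample count can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the constant and function names are illustrative, and the values are the assumptions listed above.

```python
# Back-of-the-envelope sample count. The three constants are the
# assumptions stated above; the names themselves are illustrative.
MONTHLY_ACTIVE_USERS = 300_000_000
ACTIVITIES_PER_VISIT = 40
VISITS_PER_MONTH = 10


def monthly_samples(users: int, activities_per_visit: int, visits_per_month: int) -> int:
    """Every activity shown to a user counts as one observation/sample."""
    return users * activities_per_visit * visits_per_month


samples = monthly_samples(MONTHLY_ACTIVE_USERS, ACTIVITIES_PER_VISIT, VISITS_PER_MONTH)
print(f"{samples:,} samples per month")
```

Changing any one assumption (for example, the number of visits per month) scales the sample count linearly, so the estimate is easy to redo for a different traffic profile.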
Data size
- Assume the click-through rate is about 1% over 1 month. Out of 120 billion samples, we collect about 1.2 billion positive labels and about 119 billion negative labels. This is a huge dataset.
- Generally, we can assume that we collect hundreds of features for every data point. For simplicity, assume each row takes 500 bytes to store.
- In one month, we need 120 billion rows. Total size: 500 * 120 * 10^9 = 60 * 10^12 bytes = 60 terabytes. To save costs, we can keep the last 6 months or 1 year of data in the data lake and archive older data in cold storage.
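The data size estimate can be worked through the same way. The sketch below assumes a click-through rate of exactly 1%, a flat 500 bytes per row, and decimal terabytes (10^12 bytes); the helper names are hypothetical and exist only for this illustration.

```python
# Rough label and storage estimate. All inputs are the assumptions
# from this section; names are illustrative, not from a real system.
SAMPLES_PER_MONTH = 120_000_000_000  # 120 billion rows per month
CLICK_THROUGH_RATE_PERCENT = 1       # "about 1%", taken as exactly 1% here
BYTES_PER_ROW = 500                  # simplifying assumption
BYTES_PER_TERABYTE = 10**12          # decimal terabyte


def label_counts(samples: int, ctr_percent: int) -> tuple[int, int]:
    """Split samples into (positive, negative) labels for a given CTR."""
    positives = samples * ctr_percent // 100
    return positives, samples - positives


def storage_terabytes(rows: int, bytes_per_row: int, months: int = 1) -> float:
    """Raw storage in terabytes for `months` of data, ignoring compression."""
    return rows * bytes_per_row * months / BYTES_PER_TERABYTE


positives, negatives = label_counts(SAMPLES_PER_MONTH, CLICK_THROUGH_RATE_PERCENT)
print(f"positive labels: {positives:,}")
print(f"negative labels: {negatives:,}")

for months in (1, 6, 12):
    tb = storage_terabytes(SAMPLES_PER_MONTH, BYTES_PER_ROW, months)
    print(f"{months:>2} month(s) of data: {tb:,.0f} TB")
```

Under these assumptions, one month of data is 60 TB, so keeping 6 months or 1 year in the data lake means roughly 360 TB or 720 TB before compression or replication.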
Scale
- Supports 300 million users
5. High-level design