Mobile System Design and API Design of YouTube

Explore the detailed mobile system design and API structure powering the YouTube app. Understand how MVVM-C architecture, caching, BFF aggregation, and backend services collaborate to enable seamless playback, personalized feeds, and efficient data flow on mobile devices.

In the previous lesson, we defined key design considerations such as MVVM-C architecture, playback state management, caching strategies, and lifecycle handling. In this lesson, we translate those decisions into a concrete mobile System Design.

We will first explore the high-level System Design, then zoom into detailed mobile components, and finally define the API endpoints and data models that power feed retrieval, playback, search, and user interactions.

High-level mobile System Design

When a user opens the YouTube mobile app, the View layer renders the video feed and delegates user actions such as pull-to-refresh, thumbnail taps, and search queries to the ViewModel. The ViewModel invokes the Repository layer, which checks the local data source first. This local source includes disk-cached feed metadata and a segment cache for recently watched videos. If the local cache cannot satisfy the request, the Repository falls back to the remote data source via the Backend for Frontend (BFF), a server-side component purpose-built for a specific client platform that aggregates and trims responses from multiple backend microservices into a single, optimized payload.
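The cache-first lookup described above can be sketched as follows. This is a minimal illustration in TypeScript, not the app's actual code: `FeedItem`, `FeedDataSource`, `InMemoryFeedCache`, and `FeedRepository` are hypothetical names, the in-memory map stands in for the disk cache, and the time-to-live expiry rule is an assumption, since the lesson does not specify when a cached feed counts as stale.

```typescript
// Hypothetical types: none of these names come from the YouTube app itself.
interface FeedItem {
  videoId: string;
  title: string;
}

interface FeedDataSource {
  // Resolves to null when the source cannot satisfy the request.
  getFeed(userId: string): Promise<FeedItem[] | null>;
}

interface WritableFeedDataSource extends FeedDataSource {
  saveFeed(userId: string, items: FeedItem[]): Promise<void>;
}

// In-memory stand-in for the disk cache, with an assumed time-to-live.
class InMemoryFeedCache implements WritableFeedDataSource {
  private entries = new Map<string, { items: FeedItem[]; storedAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  async getFeed(userId: string): Promise<FeedItem[] | null> {
    const entry = this.entries.get(userId);
    if (!entry) return null;
    const expired = this.now() - entry.storedAt > this.ttlMs;
    return expired ? null : entry.items;
  }

  async saveFeed(userId: string, items: FeedItem[]): Promise<void> {
    this.entries.set(userId, { items, storedAt: this.now() });
  }
}

class FeedRepository {
  private local: WritableFeedDataSource;
  private remote: FeedDataSource; // in a real app this would wrap the BFF call

  constructor(local: WritableFeedDataSource, remote: FeedDataSource) {
    this.local = local;
    this.remote = remote;
  }

  async getFeed(userId: string): Promise<FeedItem[]> {
    // 1. Check the local data source first.
    const cached = await this.local.getFeed(userId);
    if (cached !== null) return cached;

    // 2. Fall back to the remote source, then refresh the cache.
    const fresh = (await this.remote.getFeed(userId)) ?? [];
    await this.local.saveFeed(userId, fresh);
    return fresh;
  }
}
```

Because the ViewModel only talks to `FeedRepository`, it never needs to know whether a feed came from the cache or from the network.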

BFF and backend routing

The BFF aggregates calls to the video metadata service, recommendation engine, and CDN manifest resolver into a single mobile-optimized response. It trims unnecessary fields like full creator analytics or desktop-specific layout data that would waste bandwidth on a mobile device. The BFF forwards requests through the API gateway, which handles authentication, rate limiting, and request routing to backend microservices.
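To make the aggregation and trimming concrete, here is a rough sketch of a BFF handler's core logic. Everything in it is an assumption made for illustration: the `Upstreams` interface, its method names, and the payload fields are hypothetical stand-ins for the recommendation engine, video metadata service, and CDN manifest resolver, and the real services may expose very different contracts.

```typescript
// Hypothetical upstream payload; field names are illustrative only.
interface VideoMetadata {
  videoId: string;
  title: string;
  durationSeconds: number;
  creatorAnalytics?: unknown; // heavy field the mobile client does not need
  desktopLayout?: unknown; // desktop-specific field, also dropped
}

// Hypothetical clients for the three backend services named above.
interface Upstreams {
  recommendedIds(userId: string): Promise<string[]>;
  metadata(videoIds: string[]): Promise<VideoMetadata[]>;
  manifestUrls(videoIds: string[]): Promise<Record<string, string>>;
}

// The trimmed, mobile-optimized shape returned to the app.
interface MobileFeedItem {
  videoId: string;
  title: string;
  durationSeconds: number;
  manifestUrl: string | null;
}

async function buildMobileFeed(
  userId: string,
  upstreams: Upstreams,
): Promise<MobileFeedItem[]> {
  const ids = await upstreams.recommendedIds(userId);

  // Metadata and manifests are independent, so fetch them in parallel.
  const [metadata, manifests] = await Promise.all([
    upstreams.metadata(ids),
    upstreams.manifestUrls(ids),
  ]);

  const byId = new Map(metadata.map((m) => [m.videoId, m] as const));

  // Keep the recommendation order and copy only the fields mobile needs.
  const items: MobileFeedItem[] = [];
  for (const id of ids) {
    const m = byId.get(id);
    if (!m) continue; // skip ids the metadata service did not return
    items.push({
      videoId: m.videoId,
      title: m.title,
      durationSeconds: m.durationSeconds,
      manifestUrl: manifests[id] ?? null,
    });
  }
  return items;
}
```

Building each item field by field, rather than spreading the upstream object, means a new heavy field added by a backend team does not leak into the mobile payload by accident.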

The response flows back through the same chain. The ViewModel updates observable state, and the View renders the feed or initiates playback. This mirrors the MVVM-C pattern established previously but now shows concrete data flow paths between each layer.
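The last hop of this chain, where the ViewModel updates observable state and the View re-renders, might look like the sketch below. It uses a hand-rolled listener set purely for illustration; `FeedViewModel`, `FeedState`, and the `loadTitles` callback are hypothetical, and a production app would more likely rely on its platform's own observable primitives.

```typescript
// Hypothetical UI state for the feed screen.
type FeedState =
  | { status: "loading" }
  | { status: "loaded"; titles: string[] }
  | { status: "error"; message: string };

type Listener = (state: FeedState) => void;

class FeedViewModel {
  private state: FeedState = { status: "loading" };
  private listeners = new Set<Listener>();
  private loadTitles: () => Promise<string[]>; // would delegate to the Repository

  constructor(loadTitles: () => Promise<string[]>) {
    this.loadTitles = loadTitles;
  }

  // The View subscribes once, receives the current state immediately,
  // and gets back a function that removes the subscription.
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    listener(this.state);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Triggered by user actions such as pull-to-refresh.
  async refresh(): Promise<void> {
    this.publish({ status: "loading" });
    try {
      this.publish({ status: "loaded", titles: await this.loadTitles() });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      this.publish({ status: "error", message });
    }
  }

  private publish(next: FeedState): void {
    this.state = next;
    this.listeners.forEach((listener) => listener(next));
  }
}
```

The View never mutates state itself; it only renders whatever `FeedState` it is handed, which keeps rendering logic free of networking and caching concerns.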

Note: Media content such as video segments and thumbnails is delivered directly from CDN edge nodes to the client’s media loader, bypassing the BFF entirely for binary payloads.

The following diagram illustrates this end-to-end high-level architecture.

Mobile client architecture with MVVM pattern, BFF, API Gateway routing to backend microservices, and direct CDN media delivery

With the high-level data flow established, the next section zooms in on the internal mobile architecture to reveal how coordinators, dependency injection, and use cases orchestrate this flow.

Detailed mobile System Design

The ...