API Design and Data Model for Streaming System
Explore the design of APIs and data models essential for building efficient video streaming systems. Understand how to manage playback, personalized content feeds, search, ratings, and comments. This lesson helps you design scalable frontend APIs that deliver smooth streaming, optimize user interaction, and integrate seamlessly with backend services.
APIs and data models are the backbone of any scalable and efficient streaming platform. They define how clients interact with the system, fetch content, and manage user-specific preferences. Designing robust APIs is critical to ensuring smooth playback, accurate recommendations, and seamless user experiences in a system like a video streaming service. This lesson will walk you through essential APIs (covering design decisions that impact performance) and their data models, covering features such as content search, playback, ratings, and user profiles.
We will explore the following key areas:
| Section | What We Covered |
|---|---|
| API architectural styles | Choosing between REST and GraphQL for the streaming frontend |
| HTTP protocols | Negotiating between HTTP/2 and HTTP/3 for video delivery |
| Data formats | JSON for metadata, binary formats for media content |
| Data fetching patterns | Polling for on-demand content; WebSockets/SSE for live features |
| API endpoints | Data model, API design, request and response formats |
Let’s start with architectural styles!
Architectural styles
REST APIs retrieve fixed data structures, often leading to over-fetching (retrieving unnecessary data) or under-fetching (requiring multiple API calls). GraphQL, in contrast, allows the frontend to request only the data it needs, improving efficiency. For example, fetching a movie's metadata in REST returns all data related to the movie, even if we only need the title and cast for adaptive UI rendering, whereas in GraphQL we can fetch just the title and cast.
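To make the contrast concrete, here is a minimal sketch. The `MovieMetadata` shape, the query string, and the helper below are all illustrative, not the system's actual schema: the REST response carries the full document, while a GraphQL query would declare and receive only the fields the UI needs.

```typescript
// Hypothetical REST response: the full movie document is returned,
// even when the UI only needs a couple of fields (over-fetching).
interface MovieMetadata {
  title: string;
  cast: string[];
  description: string;
  durationSec: number;
  thumbnails: string[];
}

const restResponse: MovieMetadata = {
  title: "Example Movie",
  cast: ["Actor A", "Actor B"],
  description: "A long synopsis the header UI never shows",
  durationSec: 7200,
  thumbnails: ["thumb1.jpg", "thumb2.jpg"],
};

// A GraphQL query would instead declare exactly the fields the UI needs:
const graphqlQuery = `query { movie(id: "m123") { title cast } }`;

// The GraphQL server would return only the requested subset:
function pickTitleAndCast(m: MovieMetadata): Pick<MovieMetadata, "title" | "cast"> {
  return { title: m.title, cast: m.cast };
}

const graphqlStyleResponse = pickTitleAndCast(restResponse);
```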
At first glance, we’d need GraphQL to fetch specific data from different services. However, we can filter the content on the service side while responding to the requests. As backend services handle filtering and optimization, an additional GraphQL layer would add unnecessary complexity.
REST is the better choice for video streaming because it efficiently delivers video content, integrates well with CDNs, and leverages caching for performance optimization. Each backend service (e.g., metadata, recommendations) already handles filtering internally, reducing the need for GraphQL’s flexible querying.
In video streaming, backend services already handle filtering and optimization. What does this involve, and why might it reduce the need for a GraphQL layer?
HTTP protocols
The choice of HTTP protocol plays a vital role in video delivery performance, scalability, and reliability. Our streaming system adopts a mix of HTTP/2 and HTTP/3, allowing the client and server to negotiate the best available protocol at runtime. HTTP/2 offers strong compatibility and efficient multiplexing, while HTTP/3, built on QUIC, reduces latency and performs better in lossy or mobile networks. This hybrid approach ensures optimal streaming performance across various devices, browsers, and network conditions.
Data formats
For data formats, let’s evaluate the top three available formats:
JSON is suitable for structured metadata like video titles, descriptions, subtitles, and recommendations, as it is lightweight, human-readable, and easy to parse in the frontend.
Binary (e.g., MP4, WebM, HLS, MPEG-DASH) is the best choice for video content, as it provides efficient compression, fast transmission, and optimal storage, reducing latency and bandwidth usage.
XML has historically been used for video streaming manifests (e.g., MPEG-DASH MPD files) but is more verbose and slower to parse than binary formats or lightweight JSON structures.
As video playback requires efficient data handling, binary formats are used for media files, while JSON remains the best choice for metadata, ensuring fast, scalable, and optimized streaming performance.
Data fetching patterns
For on-demand video streaming, polling (REST API calls) is sufficient for fetching content like video files, metadata, and recommendations. As the data is mostly static after retrieval, there’s no need for WebSockets or SSE, making polling the most efficient and scalable choice.
However, if we opt to include live streaming, a hybrid approach is preferred—WebSockets for interactive features and SSE for real-time updates, ensuring low latency and efficient resource usage.
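The polling decision above can be sketched as follows. The fetcher signature and the feed endpoint it stands in for are assumptions for illustration; the point is simply that the client re-issues a plain GET on a schedule rather than holding a persistent connection.

```typescript
// Minimal polling sketch: re-fetch the feed a fixed number of rounds.
// In a real client, rounds would be spaced out with setTimeout/setInterval;
// on-demand content changes rarely, so a long interval suffices.
type FeedFetcher = () => Promise<string[]>;

async function pollFeed(fetchFeed: FeedFetcher, rounds: number): Promise<string[][]> {
  const results: string[][] = [];
  for (let i = 0; i < rounds; i++) {
    results.push(await fetchFeed());
  }
  return results;
}

// A stubbed fetcher standing in for a hypothetical `GET /v1.0/feed` call:
const stubFetcher: FeedFetcher = async () => ["movie-1", "movie-2"];
```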
The following table summarizes what we have decided for the different aspects of API design and data:
| Aspect | Decision |
|---|---|
| Architecture style | REST |
| Communication protocol | HTTP/2 and HTTP/3 (negotiated at runtime) |
| Data format | JSON (metadata), binary (media) |
| Data fetching pattern | Polling |
API endpoints
The sections below walk through the endpoints and methods used to access the different services of a streaming frontend system, exploring the functionality of each API and its corresponding data model to fulfill the functional requirements.
1. Playback media
The playback process involves multiple steps to ensure smooth media delivery and adaptive streaming. It begins by fetching the manifest file (e.g., DASH .mpd or HLS .m3u8), which contains essential information like available resolutions, segment URLs, and subtitles. Once the manifest is retrieved, the player can progressively make segmented requests to fetch and play the media. The playback model also tracks session details such as playback progress, resolution preferences, and user interactions like pausing or resuming playback.
We define a data model to store content-related data, which is later used to generate the manifest file needed for playback.
We’ll divide the APIs into two steps: fetching the manifest file and starting playback.
I. Fetching the manifest file
Before starting playback, the manifest file (e.g., DASH (.mpd) or HLS (.m3u8)) should be fetched to provide information like available resolutions, bitrates, and segment URLs. The player uses this for adaptive streaming. The fetchManifest() API accepts a request with the itemID for which a manifest file is needed.
HTTP method: To stream video, the client must send an HTTP `GET` request to retrieve the manifest file from the server before playing the content.

Request format: An example request to get the manifest file for a video is given below.

Response format: The returned manifest file contains information about the total number of segments, the playback time, the supported resolutions, and other metadata (such as `thumbnails`, `tags`, and `categories`) associated with the video. The response returned by the server to the above request is as follows:
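A hedged sketch of the exchange is shown below. The path `/v1.0/manifest/{itemID}` and every field name in the `Manifest` shape are assumptions for illustration; a real DASH `.mpd` or HLS `.m3u8` manifest is considerably richer.

```typescript
// Hypothetical request:  GET /v1.0/manifest/m123
// Simplified JSON view of the manifest the server might return:
interface Manifest {
  itemID: string;
  totalSegments: number;
  durationSec: number;
  resolutions: string[];      // e.g., ["480p", "720p", "1080p"]
  segmentUrlTemplate: string; // template the player expands per segment
  thumbnails: string[];
  tags: string[];
  categories: string[];
}

const sampleManifest: Manifest = {
  itemID: "m123",
  totalSegments: 1800,
  durationSec: 7200,
  resolutions: ["480p", "720p", "1080p"],
  segmentUrlTemplate: "/segments/m123/{res}/{n}.m4s",
  thumbnails: ["thumb-0.jpg"],
  tags: ["drama"],
  categories: ["movies"],
};

// The player reads the manifest to build a segment URL for a chosen resolution:
function segmentUrl(m: Manifest, res: string, n: number): string {
  return m.segmentUrlTemplate.replace("{res}", res).replace("{n}", String(n));
}
```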
II. Play content
Once we have the manifest file, we can use the data to fetch video segments using segment URLs based on network conditions. It uses the playback() API to play content.
We’ll use the following HTTP method to do so:
HTTP method: We can access the video and audio segments through an HTTP `GET` request using the segment URL returned in the manifest file.

Request format: The request contains `itemID`, `videoSegID`, and other parameters to fetch video segments based on current network conditions.

Response format: The server returns the video content associated with the `videoSegID` specified in the request URL. A sample response for a requested video segment is given below:
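The following sketch shows how a player might choose a resolution from measured bandwidth and build the segment request URL. The URL shape and the bandwidth thresholds are illustrative assumptions; the response body itself would be binary media (e.g., an `.m4s` fragment), not JSON.

```typescript
// Pick a resolution based on measured bandwidth (thresholds are illustrative):
function pickResolution(bandwidthKbps: number): string {
  if (bandwidthKbps >= 5000) return "1080p";
  if (bandwidthKbps >= 2500) return "720p";
  return "480p";
}

// Build a hypothetical segment request URL:
// GET /v1.0/items/{itemID}/segments/{videoSegID}?res=...
function segmentRequestUrl(itemID: string, videoSegID: number, bandwidthKbps: number): string {
  const res = pickResolution(bandwidthKbps);
  return `/v1.0/items/${itemID}/segments/${videoSegID}?res=${res}`;
}
```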
Why is it beneficial to stream audio and video as separate segments in a video streaming application?
III. Playback configuration
A session is created when the initial request to play a video is received. Each session is identified by a session ID, which is stored on the client (e.g., in a cookie) and sent with each request. This allows the server to associate incoming requests with a specific session, even if the user is not logged in. For logged-in users, the session ID is later included in an HTTP POST request to the playback() API to update the current playback configuration on the server side.
For example, a user may play a video on their laptop and decide to watch the rest of it later on another device. Video playback progress is not just cached on the client side but is also periodically sent to the server to track the playback position for features like “continue watching” and multi-device resumption. Here’s how this is typically handled:
Client-side caching: The video player locally tracks the current playback position (`currentPosition`) in real time. This state can be saved temporarily if the user pauses or closes the player.

Periodic updates to the server: The client periodically sends the playback progress to the server using a heartbeat mechanism or a dedicated API, e.g., `POST /v1.0/playback`. The server then stores this information in a user-specific table for later retrieval.
We’ll use a dedicated API to accurately update the stats on the server. The clients can specify the video and audio configurations through an HTTP POST request which contains parameters like currentPosition in the video (progress), userID, and itemID.
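A minimal sketch of the update payload is below. Only the `POST /v1.0/playback` path appears in this lesson; the body shape (`userID`, `itemID`, `currentPosition`, optional `resolution`) is an assumption for illustration.

```typescript
// Illustrative body for the periodic playback-configuration update.
interface PlaybackUpdate {
  userID: string;
  itemID: string;
  currentPosition: number; // seconds watched so far
  resolution?: string;     // optional user preference
}

function buildPlaybackUpdate(userID: string, itemID: string, currentPosition: number): PlaybackUpdate {
  return { userID, itemID, currentPosition };
}

// The client would send this periodically as a heartbeat, e.g.:
// fetch("/v1.0/playback", { method: "POST", body: JSON.stringify(update) })
const update = buildPlaybackUpdate("u42", "m123", 754);
```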
Note: Setting the configuration and saving user activity logs is only available for logged-in users. We can use cookies or localStorage to temporarily store configuration and activity logs client-side for users not logged in. This will not persist across devices.
2. Show content
Users who log in or browse the home page have personalized video carousels (we’ll name them content feeds) based on their viewing history, preferences, and trending content.
We use the following data model for content, combining user data (e.g., interests) and video data (e.g., timestamps, popularity) to generate a personalized movie feed:
The showFeed() API fetches the list of recommended media for the user. It uses filters like genres, watch history, and user preferences to personalize the feed.
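A hedged sketch of the personalization step is shown below. The `FeedItem` fields and the ranking-by-popularity rule are assumptions standing in for the real recommendation pipeline, which would run server-side.

```typescript
// Illustrative showFeed() logic: filter FeedItems by the user's preferred
// genres, rank by popularity, and cap the carousel length.
interface FeedItem {
  itemID: string;
  title: string;
  genres: string[];
  popularity: number; // e.g., recent view count
}

function showFeed(items: FeedItem[], preferredGenres: string[], limit: number): FeedItem[] {
  return items
    .filter((i) => i.genres.some((g) => preferredGenres.includes(g)))
    .sort((a, b) => b.popularity - a.popularity)
    .slice(0, limit);
}
```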
3. Search content
Users can search for specific content by typing keywords, filtering by genre, or searching for actors to explore available media. We store searchable metadata for movies, TV shows, and other content. We query the FeedItems as the primary data source for all searchable content. The search functionality matches user-provided filters such as region, genre, and specific queries. The response returns multiple items that match the search criteria.
The search() API enables users to search for available content by filtering across multiple parameters.
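The matching logic can be sketched as below. The `SearchableItem` fields mirror the filters named above (query, genre, region); a production system would use an inverted index rather than a linear scan, so treat this purely as a behavioral sketch.

```typescript
// Illustrative search(): match items against a free-text query
// (title or actor names) plus optional genre and region filters.
interface SearchableItem {
  itemID: string;
  title: string;
  genre: string;
  region: string;
  actors: string[];
}

function search(items: SearchableItem[], query: string, genre?: string, region?: string): SearchableItem[] {
  const q = query.toLowerCase();
  return items.filter(
    (i) =>
      (i.title.toLowerCase().includes(q) || i.actors.some((a) => a.toLowerCase().includes(q))) &&
      (genre === undefined || i.genre === genre) &&
      (region === undefined || i.region === region)
  );
}
```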
4. Rate content
Users can submit ratings and optional reviews, which help personalize future recommendations and inform other viewers. Ratings are stored with user-specific feedback on a media item.
The rate() API allows users to submit a rating for a media item.
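A minimal sketch of the rating entity and an aggregate the backend might maintain is given below. The field names (`stars`, `review`) and the 1–5 scale are assumptions, not the lesson's exact schema.

```typescript
// Illustrative rating entity: user-specific feedback on a media item.
interface Rating {
  userID: string;
  itemID: string;
  stars: number;   // assumed 1-5 scale
  review?: string; // optional free-text review
}

// Recompute the average rating for one item from stored ratings.
function averageRating(ratings: Rating[], itemID: string): number {
  const forItem = ratings.filter((r) => r.itemID === itemID);
  if (forItem.length === 0) return 0;
  return forItem.reduce((sum, r) => sum + r.stars, 0) / forItem.length;
}
```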
5. Create a comment
Users can share their opinions by leaving comments on movies or shows, enabling discussion and feedback within the platform. The data model and API design to add comments are given below:
The Comment() API accepts an itemID, userID, parentCommentID, and text (the comment) for a piece of content, reflects the comment instantly under the item, and updates the comment count.
Note: The `parentCommentID` is set to `null` by default. If a comment is a reply to an existing comment, the unique ID of the parent comment is sent in the parameters.
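The payload can be sketched as below; the `UserComment` shape follows the parameters named above, with the parent-comment ID value (`"c1"`) purely hypothetical.

```typescript
// Illustrative comment payload: parentCommentID defaults to null for
// top-level comments and carries the parent comment's ID for replies.
interface UserComment {
  itemID: string;
  userID: string;
  parentCommentID: string | null;
  text: string;
}

function makeComment(
  itemID: string,
  userID: string,
  text: string,
  parentCommentID: string | null = null
): UserComment {
  return { itemID, userID, parentCommentID, text };
}

const topLevel = makeComment("m123", "u42", "Great movie!");
const reply = makeComment("m123", "u7", "Agreed!", "c1"); // "c1" = hypothetical parent ID
```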
The following table summarizes the APIs and data models for core functions in a streaming service:
| Function | API | HTTP Method | Data Entities |
|---|---|---|---|
| Fetching manifest file | fetchManifest() | GET | Manifest (resolutions, bitrates, segment URLs) |
| Playback media | playback() | GET | Video/audio segments |
| Playback configuration | playback() | POST | Playback session (progress, preferences) |
| Show content feeds | showFeed() | GET | FeedItems |
| Search content | search() | GET | FeedItems |
| Rate content | rate() | POST | Ratings |
| Add comment | Comment() | POST | Comments |
Conclusion
This lesson explores how API design and data models are critical in creating a robust streaming service. We covered essential functionalities like fetching personalized feeds, starting playback, submitting ratings, and configuring user profiles. By understanding these APIs and how they connect to the backend, developers can ensure seamless interactions and data retrieval for users.
In the next lesson, we will optimize the frontend System Design for the streaming platform using different techniques to enhance responsiveness, improve load times, and deliver an instant and engaging user experience.