Top 20 System Design Interview Questions in 2026
Here are the essential System Design Interview questions, categorized by difficulty level. Drawing on over a decade of experience at Microsoft and Facebook, I emphasize the importance of these questions in assessing a candidate’s understanding and application of System Design fundamentals.
We'll cover the following...
- Easy System Design interview questions
- Tips for any SDI question
- Easy System Design interview questions
- Medium System Design interview questions
- 7. Design a chat service like Facebook Messenger or WhatsApp
- System Design and workflow
- 8. Design a mass social media service like Facebook or Instagram
- Requirements
- System Design and workflow
- 9. Design a proximity service like Yelp or nearby places/friends
- 10. Design a search engine-related service like Typeahead
- 11. Design a video streaming service like YouTube or Netflix
- 12. Design a ride-sharing service like Uber or Lyft
- 13. Design a social network service like Reddit or Quora
- Hard System Design interview questions
- 14. Design a text-to-text generation system like ChatGPT
- 15. Design a code deployment system
- 16. Design a social media newsfeed service
- 17. Design a collaborative editing service like Google Docs
- 18. Design Google Maps
- 19. Design a payment gateway like Stripe
- 20. Design a Scalable Data Platform for ML/AI Systems
- Bonus Design problems
- Conclusion
While SDI questions tend to evolve, many have remained popular over time. These questions are well-suited to evaluate candidates on two important levels:
Test the candidate’s understanding of System Design fundamentals.
Evaluate the candidate’s ability to apply those fundamentals in real-world applications.
We’ll break down the top 20 System Design interview questions. These are essential questions asked at top companies such as Google, Amazon, Meta, and others. Mastering these problems and their solutions will give you a significant advantage in your System Design interview prep.
To help meet you at your current preparation level, we have divided these essential System Design problems into three difficulty levels:
Easy System Design interview questions
- Design an API rate limiter for sites like Firebase or GitHub
- Design a pub/sub system like Kafka
- Design a URL-shortening service like TinyURL or bit.ly
- Design a scalable content delivery network (CDN)
- Design a web crawler
- Design a distributed cache
Medium System Design interview questions
- Design a chat service like Facebook Messenger or WhatsApp
- Design a mass social media service like Facebook or Instagram
- Design a proximity service like Yelp or nearby places/friends
- Design a search engine-related service like Typeahead
- Design a video streaming service like YouTube or Netflix
- Design a ride-sharing service like Uber or Lyft
- Design a social network and message board like Reddit or Quora
Hard System Design interview questions
- Design a text-to-text generation system like ChatGPT
- Design a code deployment system
- Design a social media newsfeed service
- Design a collaborative editing service like Google Docs
- Design Google Maps
- Design a payment gateway like Stripe
- Design a Scalable Data Platform for ML/AI Systems
Before we start breaking down specific questions, we want to provide you with some high-level System Design tips that will enable you to approach any problem with confidence.
Tips for any SDI question
Start each problem by stating what you know: List all required features of the system, common problems you expect to encounter with this sort of system, and the traffic you expect the system to handle. The listing process allows the interviewer to assess your planning skills and correct any misunderstandings before you begin the solution.
Narrate any trade-offs: Every System Design choice matters. At each decision point, list at least one positive and one negative effect of that choice.
Ask your interviewer to clarify: Most System Design questions are purposefully vague. Ask clarifying questions to demonstrate to the interviewer how you understand the question and your knowledge of the system’s needs. Also, be sure to state your assumptions before diving into the components.
Know your architectures: Most modern services are built upon a flexible microservice architecture. Unlike the monolithic architectures of the past, microservices enable smaller, agile teams to build independently within the larger system. Some older companies may have legacy systems, but microservices can coexist in parallel with legacy code and help refresh the company’s architecture.
Discuss emerging technologies: Conclude each question with an overview of how and where the system could benefit from generative AI (GenAI) and machine learning (ML). This will demonstrate that you’re prepared for not only current solutions but also future solutions.
System Design interview cheat sheet
As an added bonus, we highly recommend downloading this interview cheat sheet and internalizing its contents. (Pro tip: You may even want to set it as the background for your desktop!)
Now, let’s examine the specifics of the top System Design interview questions, starting with the easy problems.
Easy System Design interview questions
We will provide a problem statement, requirements, and workflow for each question with a high-level design.
1. Design an API rate limiter for sites like Firebase or GitHub
Problem statement: Design an API rate limiter that caps the number of API calls the service can receive in a given period to avoid an overload.
Sample clarifying questions!
Which entity is rate-limited: the user, IP address, token, or API key?
Are the rate-limiting rules configurable at runtime?
What is the expected scale in requests per second?
Requirements
Follow these requirements for a rate limiter system:
Functional requirements
- Limit requests
- Configurable
- Error or notification if the limit is reached
Nonfunctional requirements
- Availability
- Low latency
- Scalability
System Design and workflow
In a high-level rate limiter design, client requests pass through an ID builder, which assigns a unique ID to each incoming request. The ID could be a remote IP address, login ID, or a combination of other attributes. The decision maker retrieves the throttling rules from the database and decides accordingly: it either forwards the request to the application servers via the request processor or discards it and returns an error to the client (429 Too Many Requests). If requests are throttled because of a system overload, the system keeps them in a queue to be processed later.
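One way the decision maker could count requests is a sliding-window log, sketched below. This is a minimal single-process illustration, not the design's actual implementation: the class name `SlidingWindowLimiter`, the in-memory dictionary, and the choice to pass `now` explicitly (in seconds) are all assumptions made for the example.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client in any `window` seconds."""

    def __init__(self, limit: int, window: float = 60.0):
        self.limit = limit
        self.window = window
        # client ID -> timestamps of requests that were accepted
        self._log = defaultdict(deque)

    def allow(self, client_id: str, now: float) -> int:
        """Return 200 if the request may proceed, or 429 if it is throttled."""
        timestamps = self._log[client_id]
        # Drop timestamps that have fallen out of the window ending at `now`.
        while timestamps and timestamps[0] <= now - self.window:
            timestamps.popleft()
        if len(timestamps) >= self.limit:
            return 429
        timestamps.append(now)
        return 200
```

Because the window slides with each request instead of resetting on the minute boundary, ten requests at second 80 followed by another at second 130 are counted together. The cost is memory proportional to the number of accepted requests per client, which is one reason counter-based approximations are often discussed as alternatives.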
Knowledge test!
How does your system measure requests per minute? If a user makes 10 requests at 00:01:20 and then another 10 at 00:02:10, they’ve made 20 in the same one-minute window despite the minute change.
In the event of a failure, a rate limiter would be unable to perform the task of throttling. Should the request be accepted or rejected in such a scenario?
What changes would you make to the design while considering the rate limiter design for a distributed system rather than a local one?
Note: Look at the detailed design of the rate limiter to find the answers to the questions above.
2. Design a pub/sub system like Kafka
Problem statement: Design a scalable and distributed pub/sub system like Kafka that can handle massive message throughput. It should also ensure reliable message delivery and support various messaging semantics (at most once, at least once, exactly once).
Sample clarifying questions!
What message delivery guarantee is required: at-most-once, at-least-once, or exactly-once?
Is message ordering important within topics or partitions?
How long should messages be retained in the system?
Requirements
Follow these requirements for the pub/sub design:
Functional requirements
- Create a topic
- Write messages
- Subscription
- Read messages
- Specify retention time
- Delete messages
Nonfunctional requirements
- Availability
- Scalability
- Durability
- Fault-tolerant
- Concurrency management for simultaneous reads and writes
System Design and workflow
Brokers store the messages sent by producers and let consumers read them. The cluster manager monitors the brokers' health and spins up a replacement broker if one goes down. Consumer details include subscription information, the retention period, and other relevant metadata. The consumer manager manages consumers' access to messages within existing topics.
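The broker's role can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration (the `Broker` class, key-based partitioning with CRC32, and per-consumer offsets); it is not Kafka's API and omits replication, retention, and the cluster manager.

```python
from collections import defaultdict
from zlib import crc32


class Broker:
    """In-memory stand-in for a broker: topics are split into append-only partitions."""

    def __init__(self, partitions_per_topic: int = 2):
        self.partitions_per_topic = partitions_per_topic
        self.topics = {}                 # topic -> list of partitions (lists of messages)
        self.offsets = defaultdict(int)  # (consumer, topic, partition) -> next offset to read

    def create_topic(self, topic: str) -> None:
        self.topics[topic] = [[] for _ in range(self.partitions_per_topic)]

    def publish(self, topic: str, key: str, message: str) -> int:
        # Messages with the same key land in the same partition, preserving their order.
        partition = crc32(key.encode()) % self.partitions_per_topic
        self.topics[topic][partition].append(message)
        return partition

    def poll(self, consumer: str, topic: str, partition: int, max_messages: int = 10):
        # Reading does not advance the offset; the consumer must commit explicitly.
        start = self.offsets[(consumer, topic, partition)]
        return self.topics[topic][partition][start:start + max_messages]

    def commit(self, consumer: str, topic: str, partition: int, count: int) -> None:
        self.offsets[(consumer, topic, partition)] += count
```

In this sketch, a consumer that commits only after processing gets at-least-once behavior (a crash before `commit` means the same messages are polled again), while committing before processing gives at-most-once.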
Knowledge test!
How can the pub/sub design guarantee at-least-once or at-most-once delivery semantics?
How can you ensure messages are delivered in order to specific consumers?
Note: To answer the above technical questions, you can examine the detailed design of pub/sub.
3. Design a URL-shortening service like TinyURL or bit.ly
Problem statement: Design a scalable and distributed system that shortens long URLs like TinyURL or bit.ly. The system takes a long URL and generates a new, unique short URL. It should also take a shortened URL and return the original full-length URL.
Sample clarifying questions!
Should shortened URLs be globally unique or user-specific?
Are custom aliases supported, and how are collisions handled?
Do URLs expire, or are they stored permanently?
Requirements
Follow these requirements for the URL-shortening system:
Functional requirements
- URL generation
- URL storage
- Redirection to the original URL
- Customization of URLs
- Update and delete URLs
Nonfunctional requirements
- Scalability
- Availability
- Unpredictability in URL generation
- Readability
- Low latency
System Design and workflow
A load balancer is the first intermediary between the clients and the server, ensuring even distribution of incoming requests to maintain availability and reliability. When a new URL-shortening request comes in, the load balancer forwards it to a server where the rate limiter checks if the client is within the allowed request rate.
The server utilizes a sequencer to generate a unique numeric ID for each URL request. This ID is passed to an encoder, which converts it into a more readable alphanumeric string. The original URL and its corresponding shortened version are stored in a database. To enhance performance, recently accessed URLs are kept in a cache, allowing quick retrieval without repeatedly querying the database.
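The sequencer-plus-encoder step can be sketched as follows. This is a minimal illustration under stated assumptions: the base-62 alphabet order, the `UrlShortener` class, and the in-memory dictionary standing in for the database are all hypothetical choices for the example.

```python
import itertools
import string

# 62 symbols: 0-9, a-z, A-Z (the ordering is an arbitrary choice for this sketch)
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(n: int) -> str:
    """Convert a non-negative integer ID into a base-62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, remainder = divmod(n, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode_base62(s: str) -> int:
    """Inverse of encode_base62."""
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n


class UrlShortener:
    def __init__(self, start: int = 1):
        self._sequencer = itertools.count(start)  # stand-in for the sequencer service
        self._store = {}                          # stand-in for the database

    def shorten(self, long_url: str) -> str:
        short = encode_base62(next(self._sequencer))
        self._store[short] = long_url
        return short

    def resolve(self, short: str):
        return self._store.get(short)
```

Note that purely sequential IDs produce guessable short URLs, which conflicts with the unpredictability requirement listed above; a real design would need to address that, for example by not issuing IDs in strict order.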
Knowledge test!
What if two users input the same custom URL?
What if there are more users than expected?
How does the database regulate storage space?
Note: To explore the answers to the above questions in depth, check out the detailed chapters on the TinyURL System Design.
4. Design a scalable content delivery network (CDN)
Problem statement: Design a scalable content delivery network (CDN) system to efficiently distribute and cache content across globally distributed servers, minimizing latency and ensuring reliable end-user content delivery.
Sample clarifying questions!
What types of content will the CDN serve: static, dynamic, or both?
What is the regional traffic distribution and expected scale?
Requirements
Follow these requirements for a CDN system:
Functional requirements
- Retrieve content from the origin server
- Respond to user requests
- Auto content delivery from the origin server
- Search
- Update content from the origin or peer CDNs
Nonfunctional requirements
- Scalability
- Availability
- Reliability
- Security
- Low latency
System Design and workflow
When a client requests content, a request routing system finds the address of the nearest or fastest server to minimize wait time. A load balancer then routes the request to the optimal server. If the requested content is cached on that server, it is immediately delivered to the client. If not, the server fetches the content from the origin server, caches it locally for subsequent requests, and then serves it to the user.
The CDN system ensures that frequently accessed content remains readily available while less popular content is periodically purged. The system also includes monitoring and analytics to track performance, optimize routing, and ensure high availability and reliability.
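The cache-hit and cache-miss paths on an edge server can be sketched as a pull-through cache with expiry. The `EdgeServer` class, the `fetch_from_origin` callback, and the fixed TTL are assumptions for illustration; request routing, load balancing, and popularity-based purging are left out.

```python
class EdgeServer:
    """Pull-through cache: serve from local storage, fall back to the origin on a miss."""

    def __init__(self, fetch_from_origin, ttl: float = 300.0):
        self.fetch_from_origin = fetch_from_origin  # hypothetical callable: path -> content
        self.ttl = ttl
        self._cache = {}  # path -> (content, expiry time)

    def get(self, path: str, now: float):
        entry = self._cache.get(path)
        if entry is not None and entry[1] > now:
            return entry[0], "HIT"
        # Miss or expired entry: fetch from the origin and cache for later requests.
        content = self.fetch_from_origin(path)
        self._cache[path] = (content, now + self.ttl)
        return content, "MISS"
```

With a TTL of 300 seconds, the first request for a path goes to the origin, later requests within five minutes are served from the edge, and the next request after expiry refreshes the copy.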
Knowledge test!
How would you determine which content to cache on edge servers?
How would you distribute traffic evenly across multiple edge servers?
How would you ensure the CDN infrastructure’s scalability, availability, and fault tolerance?
How would you optimize the delivery and reduce the latency while streaming?
Note: Check out the chapter on the design of a content delivery network to help you understand and get answers to the above questions.
5. Design a web crawler
Problem statement: Design a web crawler that systematically browses the internet to discover and index web pages. The crawler should efficiently navigate websites, retrieve content, and follow links to discover new pages.
Sample clarifying questions!
Should the crawler extract media content like images and videos or only HTML?
Should the crawler obey robots.txt and crawl-delay rules?
What is the depth and frequency of crawl required per domain?
Requirements
Follow these requirements for the web crawler system:
Functional requirements
- Crawling
- Storing crawled content
- Scheduling for periodic crawling
Nonfunctional requirements
- Scalability
- Consistency
- Reliability
- Extensibility to network protocols
System Design and workflow
A web crawler begins by assigning a worker to a URL. Once the DNS is resolved, the worker sends the URL and IP address to an HTML fetcher to establish the connection. The URL and HTML content are extracted from the page and stored in the cache for processing. The duplicate eliminator service then tests this content to ensure no duplicate content is transferred to blob storage. Once this cycle is complete for a single URL, it moves on to the next address in the queue.
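The single-worker cycle described above can be sketched as a queue-driven loop. This is a simplified illustration: `fetch` is a hypothetical callback standing in for DNS resolution, the HTML fetcher, and the extractor, and a dictionary stands in for blob storage.

```python
import hashlib
from collections import deque


def crawl(seed_urls, fetch, max_pages=100):
    """Crawl breadth-first from the seeds.

    `fetch(url)` is assumed to return (html, links) for a page.
    Returns a dict of url -> html for pages with previously unseen content.
    """
    frontier = deque(seed_urls)  # URLs waiting to be crawled
    seen_urls = set(seed_urls)   # avoids queuing the same URL twice
    seen_content = set()         # content hashes used by the duplicate eliminator
    stored = {}                  # stand-in for blob storage

    while frontier and len(stored) < max_pages:
        url = frontier.popleft()
        html, links = fetch(url)
        digest = hashlib.sha256(html.encode()).hexdigest()
        if digest not in seen_content:
            seen_content.add(digest)
            stored[url] = html
        for link in links:
            if link not in seen_urls:
                seen_urls.add(link)
                frontier.append(link)
    return stored
```

Hashing the full page only catches exact duplicates; near-duplicate detection, politeness delays, and crawler-trap protection would need additional logic.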
Knowledge test!
What functionalities must be added to extract all formats (images and video)?
Real web crawlers have multiple workers handling separate URLs simultaneously. How does this change the queuing process?
How can you account for crawler traps?
Note: To get the answers to the above questions, check out the detailed chapters on the web crawler System Design.
6. Design a distributed cache
Problem statement: Design a distributed caching system that provides fast, scalable, and reliable data retrieval across multiple servers. The system should efficiently manage cache consistency, handle high volumes of read and write requests, ensure data availability, and provide mechanisms for cache eviction and expiration.
Sample clarifying questions!
What is the typical read-to-write ratio in the expected workload?
Should the cache support write-through or write-back strategies?
Will the cache operate across regions or within a single data center?
Requirements
Follow these requirements for the distributed cache system:
Functional requirements
- Insert or write data
- Retrieve data
- Data partitioning
- Cache eviction
Nonfunctional requirements
- Scalability
- Consistency
- Low latency
- High availability
System Design and workflow
A distributed caching system begins by partitioning the data across multiple cache nodes to balance the load and improve access speed. When a client requests data, an application server determines the appropriate cache node based on a consistent hashing algorithm, ensuring an even distribution of requests and quick lookups.
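Consistent hashing, as mentioned above, can be sketched with a sorted ring of virtual nodes. The `HashRing` class, the use of SHA-256, and the default of 100 virtual nodes per server are assumptions chosen for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    return int(hashlib.sha256(key.encode()).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring; each node is placed at `vnodes` points to even out load."""

    def __init__(self, nodes=(), vnodes: int = 100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (hash, node)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def get_node(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring is empty")
        # First ring position at or after the key's hash, wrapping around at the end.
        idx = bisect.bisect_right(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]
```

The property that matters for a cache is that removing a node only remaps the keys that node owned; keys on the remaining nodes stay where they were, so most cached data remains reachable.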
If the data is found in the cache (a cache hit), it is returned to the client immediately, significantly reducing latency. If the data is not found (a cache miss), the system retrieves it from the primary data store, caches it, and then serves it to the client. Cache eviction policies, such as least recently used (LRU) or time-to-live (TTL), manage the removal of stale data to free up space.
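An LRU eviction policy on a single cache node can be sketched with Python's `OrderedDict`. The class name and capacity are illustrative assumptions; TTL-based expiry is not shown here.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data = OrderedDict()  # oldest entry first, most recently used last

    def get(self, key):
        if key not in self._data:
            return None  # cache miss: the caller would read the primary data store
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used entry
```

Both `get` and `put` run in constant time, which keeps the eviction bookkeeping from adding noticeable latency to lookups.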
Knowledge test!
How do you ensure data consistency across multiple cache nodes, especially during updates and deletions?
What strategies can be implemented to handle cache misses efficiently without overloading the primary data store?
What methods can maintain low latency and high throughput under heavy load conditions?
How do you secure the cache data against unauthorized access and ensure privacy?
Note: To answer such conceptual questions, check out the detailed design of the distributed cache.
Medium System Design interview questions
We will provide the problem statement, requirements, workflow, and system architecture for each medium system design question.
7. Design a chat service like Facebook Messenger or WhatsApp
Problem statement: Design a scalable, reliable, and secure real-time chat service like Facebook Messenger or WhatsApp to support instant messaging, group chats, notifications, and multimedia sharing.
Sample clarifying questions!
Should the system support both one-to-one and group chats?
Are messages required to be end-to-end encrypted?
Should messages be stored indefinitely or have a retention policy?
Requirements
Follow these requirements for the WhatsApp System Design:
Functional requirements
- Real-time communication (individual/group)
- Message delivery acknowledgment
- Sharing of media content
- Chat storage
- Notifications
Nonfunctional requirements
- Availability
- Low latency
- Scalability
- Consistency
- Security
System Design and workflow
In a real-time communication system, senders and receivers are connected to chat servers. Chat servers deliver messages from sender to receiver via a messaging queue. Various protocols, such as WebSocket, XMPP, MQTT, and the Real-time Transport Protocol (RTP), can be used for real-time communication. A connection manager establishes the real-time connections between clients and chat servers; for instance, a WebSocket manager establishes WebSocket connections between users and different chat servers. Messages can also be stored persistently in the database.
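The delivery path can be illustrated with a small in-memory sketch in which a per-user queue holds messages for receivers who are not connected. The `ChatServer` class and the `deliver` callback (a stand-in for an open WebSocket connection) are assumptions for the example; encryption, acknowledgments, and group chats are omitted.

```python
from collections import defaultdict, deque


class ChatServer:
    def __init__(self):
        self.connections = {}              # user ID -> callable that pushes to an open connection
        self.pending = defaultdict(deque)  # user ID -> messages waiting for the user to reconnect
        self.history = []                  # stand-in for the persistent message store

    def connect(self, user: str, deliver) -> None:
        self.connections[user] = deliver
        # Flush anything that arrived while the user was offline, oldest first.
        while self.pending[user]:
            deliver(self.pending[user].popleft())

    def disconnect(self, user: str) -> None:
        self.connections.pop(user, None)

    def send(self, sender: str, receiver: str, text: str) -> str:
        message = {"from": sender, "to": receiver, "text": text}
        self.history.append(message)
        deliver = self.connections.get(receiver)
        if deliver is not None:
            deliver(message)
            return "delivered"
        self.pending[receiver].append(message)
        return "queued"
```

In this sketch, a message sent to an offline user is queued and delivered in order as soon as that user connects again.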
Knowledge test!
What happens if a message is sent when the user isn’t connected to the internet? Is it sent when the connection is restored?
How will you encrypt and decrypt the message without increasing latency?
How do users receive notifications?
Are messages pulled from the device (the server periodically asks devices whether they have a message waiting to be sent), or are they pushed to the server (the device notifies the server that it has a message to send)?
Note: Look at the detailed design of the real-time chat service to get answers to such questions.
8. Design a mass social media service like Facebook or Instagram
Problem statement: Design a social media service like Instagram that is used by several million users. Users should be able to view a newsfeed containing posts from the users they follow, along with suggested content they may like.
Sample clarifying questions!
Should feed generation be on write, on read, or hybrid?
Should the system support images, video, or only text content?
How personalized should the user feed be?
Requirements
Follow these requirements for the Instagram system:
Functional requirements
- Create a post
- Delete a post
- Edit a post
- Share a post
- Follow and unfollow users
- Search for content
- View the system’s generated feed
- Like and dislike posts
Nonfunctional requirements
- Scalability
- Availability
- Low latency
- Reliability
- Security
Based on the above requirements, let’s create a high-level design of a feed-based social system like Instagram.
System Design and workflow
The high-level design of a feed-based social network encompasses a post service, a timeline generation service, a feed publishing service, and a feed ranking and recommendation engine. The post service handles the clients' posts, and each post is published on the client's wall (page). The timeline generation service generates feeds for friends and followers. It relies on the feed ranking and recommendation engine, which ranks and recommends the top N posts to followers based on their interests, searches, and watch history. The generated feed is stored in the database, and the feed publishing service is responsible for publishing and showing the generated feeds to followers. As the feed could contain videos, the CDN is responsible for delivering the videos to followers with low latency.
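Timeline generation can be sketched as fan-out on write: each new post is copied into the stored feed of every follower, and reading the feed returns the top N entries. The `FeedService` class is hypothetical, and ranking by timestamp is a deliberate simplification standing in for the ranking and recommendation engine.

```python
import heapq
from collections import defaultdict


class FeedService:
    def __init__(self):
        self.followers = defaultdict(set)  # author -> users who follow them
        self.feeds = defaultdict(list)     # user -> list of (timestamp, author, text)

    def follow(self, follower: str, followee: str) -> None:
        self.followers[followee].add(follower)

    def publish(self, author: str, text: str, timestamp: float) -> None:
        # Fan-out on write: push the post into every follower's stored feed.
        for follower in self.followers[author]:
            self.feeds[follower].append((timestamp, author, text))

    def top_n(self, user: str, n: int = 10):
        # Newest first; a real ranking engine would score by more than recency.
        return heapq.nlargest(n, self.feeds[user])
```

Fan-out on write makes reads cheap but makes a post by an account with millions of followers expensive, which is why hybrid approaches are commonly discussed for celebrity accounts.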
Knowledge test!
Influencers or celebrities will have millions of followers; how are they handled vs. standard users?
How does the system weigh posts by age? Old posts are less likely to be viewed than new posts.
What’s the ratio of read- and write-focused nodes? Are there likely to be more read requests (users viewing posts) or write requests (users creating posts)?
How can you increase availability? How does the system update? What happens if a node fails?
How do you efficiently store posts and images?
Note: Look at the detailed design of Instagram for a better understanding.
9. Design a proximity service like Yelp or nearby places/friends
Problem statement: Design a proximity server that stores and reports the distance to places like restaurants. Users can search nearby places by distance or popularity. The database must store data for hundreds of millions of businesses across the globe.
Sample clarifying questions!
How should results be sorted: by distance, rating, or popularity?
Is real-time tracking needed for friends or businesses?
Requirements
Follow these requirements for a System Design like Yelp:
Functional requirements
- User accounts
- Search
- Feedback
Nonfunctional requirements
- Scalability
- High availability
- Consistency
- Performance
System Design and workflow
The system handles search requests by using load balancers to distribute read requests to the read service, which then queries the quadtree service to identify relevant places within a specified radius. The quadtree service also refines the results before they are sent to the clients. To add places or feedback, write requests are similarly routed through load balancers to the write service, which updates a relational database and stores images in blob storage. The system also segments the world map into smaller parts, stores places in a key-value store, and periodically updates these segments to include new places. Because new places are added infrequently, this update can run monthly.
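A quadtree of the kind the quadtree service relies on can be sketched as follows. This is a simplified, in-memory illustration: the `QuadTree` class, the leaf capacity of 4, the depth cap, and the use of plain x/y coordinates with a bounding-box query (rather than a true radius on a sphere) are all assumptions.

```python
class QuadTree:
    """Point quadtree over the rectangle (x0, y0, x1, y1)."""

    def __init__(self, x0, y0, x1, y1, capacity=4, depth=0, max_depth=16):
        self.bounds = (x0, y0, x1, y1)
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth  # stops endless splitting on identical points
        self.points = []            # (x, y, name) entries held by a leaf
        self.children = None        # four sub-quadrants once the leaf splits

    def _contains(self, x, y):
        x0, y0, x1, y1 = self.bounds
        return x0 <= x <= x1 and y0 <= y <= y1

    def insert(self, x, y, name):
        if not self._contains(x, y):
            return False
        if self.children is None:
            if len(self.points) < self.capacity or self.depth >= self.max_depth:
                self.points.append((x, y, name))
                return True
            self._split()
        # any() stops at the first child that accepts the point.
        return any(child.insert(x, y, name) for child in self.children)

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        args = (self.capacity, self.depth + 1, self.max_depth)
        self.children = [
            QuadTree(x0, y0, mx, my, *args), QuadTree(mx, y0, x1, my, *args),
            QuadTree(x0, my, mx, y1, *args), QuadTree(mx, my, x1, y1, *args),
        ]
        for px, py, pname in self.points:
            any(child.insert(px, py, pname) for child in self.children)
        self.points = []

    def query(self, qx0, qy0, qx1, qy1):
        """Return all points inside the query rectangle."""
        x0, y0, x1, y1 = self.bounds
        if qx1 < x0 or qx0 > x1 or qy1 < y0 or qy0 > y1:
            return []  # no overlap, so this whole subtree can be skipped
        found = [p for p in self.points if qx0 <= p[0] <= qx1 and qy0 <= p[1] <= qy1]
        for child in self.children or []:
            found.extend(child.query(qx0, qy0, qx1, qy1))
        return found
```

Dense areas split into many small cells while sparse areas stay as large ones, so each leaf holds a bounded number of places regardless of population density.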
Knowledge test!
How do you store lots of data and retrieve search results quickly?
How should the system handle different population densities? Rigid latitude/longitude grids (fixed, inflexible grid structures that divide geographic space by latitude and longitude coordinates) will cause varied responsiveness based on density.
Can we optimize commonly searched locations?
Note: Look at the detailed design of Yelp to get answers to the above questions.
10. Design a search engine-related service like Typeahead
Problem statement: Design a typeahead suggestion system that provides real-time, relevant autocomplete and autocorrect suggestions as users type, ensuring low latency and scalability to efficiently handle a large volume of queries.
Sample clarifying questions!
What is the maximum allowed latency for suggestions?
Should the system adapt to user search history and preferences?
How often should the autocomplete dataset be refreshed?
Requirements
Follow these requirements for the system:
Functional requirements
- Autocomplete
- Autocorrect
Nonfunctional requirements
- Scalability
- Fault tolerance
- Performance
System Design and workflow
When a user starts typing a query, each character is sent to an application server. A suggestion service gathers the top N suggestions from a distributed cache, such as Redis, and returns the list to the user. A separate service, the data collector and aggregator, takes the queries, ranks them analytically, and stores the results in a NoSQL database. The trie builder is a service that takes the aggregated data from the NoSQL database, builds tries, and stores them in the trie database.
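The trie that the trie builder produces can be sketched as below. The `Trie` class, the per-query frequency counts, and ranking purely by frequency are assumptions for illustration; a production system would typically precompute top suggestions per node rather than walk the subtree on every keystroke.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # next character -> TrieNode
        self.count = 0      # how often the query ending at this node was searched


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def add(self, query: str, count: int = 1) -> None:
        node = self.root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
        node.count += count

    def suggest(self, prefix: str, n: int = 5):
        """Return up to n completed queries for the prefix, most frequent first."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        results = []
        stack = [(node, prefix)]
        while stack:  # depth-first walk of everything below the prefix
            current, text = stack.pop()
            if current.count:
                results.append((current.count, text))
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        # Sort by descending frequency, breaking ties alphabetically.
        results.sort(key=lambda item: (-item[0], item[1]))
        return [text for _, text in results[:n]]
```

For example, after adding "car" (5 searches), "cat" (3), and "card" (7), the prefix "ca" with `n=2` yields "card" followed by "car".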
Knowledge test!
How strongly do you weigh spelling mistake corrections?
How do you update selections without causing latency?
How do you determine the most likely completed query? Does it adapt to the user’s searches?
What happens if the user types very quickly? Do suggestions only appear after they’re done?
Note: Look at the detailed design of the Typeahead system for a better understanding of the system.
11. Design a video streaming service like YouTube or Netflix
Problem statement: Design a video streaming service like YouTube or Netflix that allows users to upload and stream videos. The service should efficiently store many videos and their metadata and return accurate and quick results for user search queries.
Sample clarifying questions!
What is the expected volume of uploads and concurrent streams?
Are live streaming features needed, or only on-demand?
Requirements
Follow these requirements for a streaming service System Design:
Functional requirements
- Search videos
- Upload videos
- Stream videos
- Rate videos
Nonfunctional requirements
- Availability
- Scalability
- Low latency (to stream a video)
- Support multiple formats
System Design and workflow
A load balancer first handles video upload requests by sending them to the application servers. The application server interacts with the video service, which triggers transcoders to convert the video into different formats and resolutions. These typically range from 144p to 1440p but can reach 4K. The transcoded video is then saved to the blob store, and its metadata is stored in the metadata database. The video service sends the transcoded video to content delivery networks (CDNs), which hold popular and recent uploads and reduce latency when delivering video to end users. The CDN stores and delivers requested data to users in conjunction with colocation sites.
Knowledge test!
How will your service ensure smooth video streaming on various internet qualities?
How are the videos stored?
How will the system provide a personalized experience to each user with recommendations?
How does the system react to a sudden drop in network quality, for example by shifting to a lower-quality stream or buffering content?
Note: Check out the detailed chapter on YouTube System Design that answers the above concerns during the design.
12. Design a ride-sharing service like Uber or Lyft
Problem statement: Design a system for a ride-sharing service similar to Uber, where users can request rides and drivers can accept these requests. The system should efficiently match drivers with riders based on location and availability, handle real-time updates on ride statuses, manage payments securely, and ensure a seamless user experience from booking to ride completion.
Sample clarifying questions!
Should the system support different ride types (economy, premium, carpool)?
How frequently are driver and rider locations updated?
Are wallet systems, promotions, or refunds part of the payment system?
Requirements
Follow these requirements for the System Design:
Functional requirements
- Location tracking
- Request a ride
- Show nearby drivers
- Calculate and notify ETA
- Trip process (confirmation and updates)
- Payment
Nonfunctional requirements
- Scalability
- Availability
- Reliability
- Low latency
- Consistency
- Security
System Design and workflow
A user’s request is sent to the application server via a load balancer and API gateway. The system accepts the rider’s request, and the trip service or manager provides an estimated time of arrival (ETA) based on the type of vehicle. The drivers and location manager use a matching algorithm to find the nearest available drivers and send the request to those drivers by notifying them via a notification service. When a driver matches with a rider, the application should return the trip and rider information. The driver’s location is regularly recorded and communicated to relevant users through a pub/sub service.
Once the ride is complete, the trip manager ensures payment is securely processed through a payment gateway. We leverage a database that stores user and driver profiles, ride history, and payment information. We also utilize caching mechanisms to expedite access to frequently requested data, and continuous monitoring ensures the service operates smoothly.
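One way to avoid scanning every driver during matching is a grid index that only inspects the rider's cell and its neighbors. The sketch below is an assumption-laden illustration: the `DriverIndex` class, the 0.01-degree cell size, and the haversine distance are choices made for the example, not the matching algorithm of any real service.

```python
import math
from collections import defaultdict


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


class DriverIndex:
    def __init__(self, cell_size: float = 0.01):
        self.cell_size = cell_size     # cell width in degrees (hypothetical value)
        self.cells = defaultdict(set)  # (row, col) -> IDs of drivers in that cell
        self.positions = {}            # driver ID -> (lat, lon)

    def _cell(self, lat, lon):
        return (math.floor(lat / self.cell_size), math.floor(lon / self.cell_size))

    def update(self, driver: str, lat: float, lon: float) -> None:
        old = self.positions.get(driver)
        if old is not None:
            self.cells[self._cell(*old)].discard(driver)
        self.positions[driver] = (lat, lon)
        self.cells[self._cell(lat, lon)].add(driver)

    def nearest(self, lat: float, lon: float, k: int = 3):
        """Return up to k drivers from the rider's cell and its 8 neighbors, closest first."""
        row, col = self._cell(lat, lon)
        candidates = []
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                candidates.extend(self.cells.get((row + dr, col + dc), ()))
        return sorted(candidates,
                      key=lambda d: haversine_km((lat, lon), self.positions[d]))[:k]
```

The search only considers the 3x3 block of cells around the rider, so drivers farther away are ignored even when none are nearby; a fuller version would widen the search ring until enough candidates are found.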
Knowledge test!
How can you keep latency low during busy periods?
How is the driver paired with the user? Iterating over all drivers to find the Euclidean distance would be inefficient.
What happens if the driver or user loses connection?
How would you update the ETA during a ride in peak hours?
Note: Check out the detailed chapter on Designing Uber that answers the above concerns during the design.
13. Design a social network service like Reddit or Quora
Problem statement: These social network sites operate on a forum-based system that allows users to post questions and links. For simplicity’s sake, focus more on designing Quora. You’re unlikely to need to walk through the design of something like Reddit’s subreddit or karma system in an interview.
Sample clarifying questions!
What types of content are supported: text, images, videos, links?
Should voting affect visibility globally or per user?
Are real-time notifications required for interactions?
Requirements
Follow these requirements for a System Design like Quora:
Functional requirements
- Post questions and answers
- Vote and comment
- Search
- Answer ranking
- Recommendation system
Nonfunctional requirements
- Scalability and consistency
- Availability
- Performance
System Design and workflow
In Quora’s high-level design, users interact through a web server, which communicates with an application server to handle actions such as posting questions, answers, and comments. Content, such as images and videos, is stored in blob storage, while question-and-answer data, along with user profiles and interactions, are stored in a MySQL database.
A machine learning engine analyzes user interactions and content to rank answers based on relevance and quality. This engine continuously learns from user feedback to improve its ranking algorithms. For personalized user experiences, a recommendation system utilizes machine learning models to tailor content based on individual interests and behaviors.
Knowledge test!
How can you ensure the system’s scalability to handle millions of simultaneous users posting questions and answers?
What strategies can efficiently store and retrieve large multimedia content in blob storage?
How would you design the database schema to manage the relationships between users, questions, answers, and comments in a scalable way?
What techniques can be used to effectively rank answers, ensuring that high-quality content is prioritized for users?
How can you optimize the performance of the machine learning engine to rank answers quickly and accurately?
Note: Check out the detailed chapter on Quora System Design to help you understand the system.
Hard System Design interview questions
Hard System Design interview questions pertain to complex, open-ended problems that necessitate in-depth technical knowledge, critical thinking, and the ability to design scalable and efficient systems within constraints. Let’s start with the System Design of a ChatGPT-style service.
14. Design a text-to-text generation system like ChatGPT
Problem statement: Design a scalable text generation system that can handle real-time, context-aware conversations. The system must accept user prompts, generate human-like responses, maintain conversation context, and support personalization, while ensuring low latency, high availability, and secure handling of user data.
Sample clarifying questions!
How should the system maintain conversation context across multiple interactions?
Do we need to support personalized responses based on user history?
Should the system provide moderation for inappropriate content?
Requirements
Follow these requirements for the ChatGPT-style service:
Functional requirements
- Dialogue management (context-aware conversations)
- Natural language understanding and intent extraction
- Personalization based on prior interactions
- Feedback-based improvements
Nonfunctional requirements
- Scalability
- Low latency
- High availability and reliability
- Privacy and security of user data
System Design and workflow
When a user submits a prompt, the API Gateway handles authentication and rate limiting, and a load balancer distributes the request to application servers. The input is pre-processed by the NLU service, which extracts intent, entities, and embeddings to maintain context. Model servers generate responses using conversation history. Responses pass through content moderation, while user feedback is collected to improve the system. Finally, the response is returned to the user, enabling a fast, context-aware conversation.
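One way to picture how model servers "generate responses using conversation history" is a context builder that keeps only the most recent messages that fit a token budget. The sketch below is illustrative only: `Message`, `count_tokens`, `build_context`, and the word-based token count are hypothetical stand-ins, not part of any real model API.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str  # "user" or "assistant"
    text: str


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def build_context(history: list[Message], prompt: str, budget: int) -> list[Message]:
    """Return the newest messages that fit in `budget` tokens, ending with the prompt."""
    remaining = budget - count_tokens(prompt)
    if remaining < 0:
        raise ValueError("prompt alone exceeds the token budget")
    kept: list[Message] = []
    for message in reversed(history):  # walk from newest to oldest
        cost = count_tokens(message.text)
        if cost > remaining:
            break  # stop at the first message that does not fit
        kept.append(message)
        remaining -= cost
    kept.reverse()  # restore chronological order
    kept.append(Message("user", prompt))
    return kept


history = [
    Message("user", "Hi there"),
    Message("assistant", "Hello, how can I help"),
    Message("user", "Explain load balancers"),
    Message("assistant", "They spread requests across servers"),
]
context = build_context(history, "And what about rate limiting", budget=14)
for message in context:
    print(f"{message.role}: {message.text}")
```

Dropping the oldest turns first is only one policy. A production system might instead summarize older turns or retrieve relevant ones by embedding similarity.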
The following illustrations show a high-level design of a ChatGPT-style service:
Knowledge test!
Why is response streaming important in a conversational AI platform, and how does it affect perceived latency?
What are the trade-offs between stateless vs. stateful architecture for managing conversation history?
What mechanisms would you implement to ensure the system filters out harmful or inappropriate content in real-time?
15. Design a code deployment system
Problem statement: Design a reliable and scalable code deployment system for a large-scale distributed application. The system should automate the building, testing, and deployment of code changes across environments with minimal disruption, while also providing the ability to monitor and roll back changes as needed.
Sample clarifying questions!
What rollback strategy is required: full, partial, or per environment?
Is deployment approval manual, automated, or a combination of both?
Requirements
Follow these requirements for a code deployment system:
Functional requirements
- Version control integration
- Automated code building
- Multi-environment deployment
- Environment configuration
- Automated rollbacks
- Deployment monitoring
- Support for deployment strategies
Nonfunctional requirements
- Availability
- Fault tolerance
- Performance
- Scalability
- Security
System Design and workflow
The high-level design of the code deployment system includes all the major components needed to meet the outlined requirements. The process begins when developers submit code to a version control system (VCS). Any new code changes trigger a continuous integration (CI) service, which automatically integrates updates, runs preliminary tests, and prepares the code for deployment. Once validated, the code is published to a queue, which decouples build triggers from execution.
A dedicated build service listens to this queue and retrieves jobs to compile the code. It then generates binary artifacts and stores them in a versioned blob storage system. These artifacts represent the system’s deployable output. When it’s time to deploy, the deployment service pulls the necessary artifacts from blob storage and installs them on machines across different regions. This ensures consistent deployments across multiple environments, including staging and production.
The architecture supports gradual rollouts, rollback mechanisms, and monitoring at each step, helping to reduce risks and improve reliability in production.
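The gradual rollout and rollback behavior described above can be sketched as a staged loop with a health check after each stage. This is a minimal sketch under stated assumptions: the `rollout` function, the stage fractions, and the `deploy` and `healthy` callbacks are hypothetical, and an in-memory dictionary stands in for real machines.

```python
from typing import Callable


def rollout(
    hosts: list[str],
    new_version: str,
    old_version: str,
    deploy: Callable[[str, str], None],
    healthy: Callable[[str], bool],
    stages: tuple[float, ...] = (0.1, 0.5, 1.0),
) -> bool:
    """Deploy in widening stages; roll back every updated host on the first failed check."""
    updated: list[str] = []
    for fraction in stages:
        target = max(1, int(len(hosts) * fraction))
        for host in hosts[len(updated):target]:
            deploy(host, new_version)
            updated.append(host)
        if not all(healthy(host) for host in updated):
            for host in updated:
                deploy(host, old_version)  # restore the previous artifact
            return False
    return True


# In-memory fleet used only for demonstration.
fleet = {f"host-{i}": "v1" for i in range(10)}
touched: set[str] = set()


def deploy(host: str, version: str) -> None:
    fleet[host] = version
    touched.add(host)


# Simulate a bad release: host-3 fails its health check after the second stage.
succeeded = rollout(list(fleet), "v2", "v1", deploy, healthy=lambda host: host != "host-3")
print(succeeded, sorted(touched))
```

In this run the failure is caught at the 50% stage, so the remaining hosts never receive the new version, which is the risk reduction that staged rollouts aim for.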
A high-level design of a code deployment system is depicted in the following illustration:
Knowledge test!
How would you ensure zero-downtime deployments in this system?
What are the key considerations when designing for rollback capability?
If deployments fail in only one region, how would you isolate and debug the issue without affecting global deployments?
As your engineering team grows and deploys more frequently, what changes would you make to maintain fast and stable builds?
Note: Look at the Design a Deployment System for a better understanding of the system.
16. Design a social media newsfeed service
Problem statement: Design a scalable and efficient social media newsfeed system that delivers personalized, real-time content updates to users, ensuring low latency, high availability, and scalability.
Sample clarifying questions!
Should the feed be push-based, pull-based, or a hybrid of both?
What level of personalization is required?
Does the feed support multimedia content, such as images and videos?
Requirements
Follow these requirements for the design:
Functional requirements
- Newsfeed generation
- Newsfeed contents
- Newsfeed display
Nonfunctional requirements
- Scalability
- Fault tolerance
- Availability
- Low latency
System Design and workflow
In the following high-level design of a newsfeed system, clients create posts or request their newsfeed through the app, and the load balancer forwards each request to a web server for authentication and routing. Whenever a user’s friends (or followers) create a post via the post service, the notification service informs the newsfeed generation service, which builds newsfeeds from those posts and keeps them in the newsfeed cache. The newsfeed publishing service then publishes the generated feeds from the newsfeed cache to the user’s timeline, appending multimedia content from blob storage where required.
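The generate-then-cache flow above resembles a push-based (fan-out-on-write) feed. As a rough sketch, assuming a hypothetical `NewsfeedCache` class and an in-memory store in place of a real cache, each new post is pushed into a bounded, newest-first list per follower:

```python
from collections import defaultdict, deque


class NewsfeedCache:
    """Push-based feed: each new post is written into every follower's cached feed."""

    def __init__(self, max_items: int = 3) -> None:
        self.followers: dict[str, set[str]] = defaultdict(set)  # author -> followers
        # user -> newest-first post IDs, capped at max_items to bound memory use
        self.feeds: dict[str, deque] = defaultdict(lambda: deque(maxlen=max_items))

    def follow(self, follower: str, author: str) -> None:
        self.followers[author].add(follower)

    def publish(self, author: str, post_id: str) -> None:
        for follower in self.followers[author]:
            self.feeds[follower].appendleft(post_id)  # oldest entry drops off when full

    def feed(self, user: str) -> list[str]:
        return list(self.feeds[user])


cache = NewsfeedCache(max_items=3)
cache.follow("alice", "bob")
cache.follow("alice", "carol")
for author, post_id in [("bob", "p1"), ("carol", "p2"), ("bob", "p3"), ("bob", "p4")]:
    cache.publish(author, post_id)
print(cache.feed("alice"))
```

Storing only post IDs and capping the feed length are two common ways to limit cache size. Authors with very large followings are often handled with a pull at read time instead, which this sketch does not cover.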
Knowledge test!
Creating and storing newsfeeds for each user in the cache requires a substantial amount of memory. Is there any way to reduce this memory consumption?
What mechanisms would you implement to prioritize and filter content in the newsfeed to prevent information overload for users?
How can the system ensure consistency and order of posts in the newsfeed, especially in a distributed environment with multiple data centers?
Note: If you need answers to such questions, look at the detailed design of a newsfeed service.
17. Design a collaborative editing service like Google Docs
Problem statement: Design a collaborative editing service that lets users remotely and simultaneously make changes to text documents. The changes should be displayed in real time. Like other cloud-based services, documents should be consistently available to any logged-in user on any machine. Your solution must be scalable to support thousands of concurrent users.
Sample clarifying questions!
What collaboration model is used: character-level or paragraph-level?
Should the system support offline editing and later sync?
How will conflicts between concurrent edits be resolved?
Requirements
Follow these requirements for the Google Docs system:
Functional requirements
- Collaboration
- Edit overlap
- Autocomplete and grammatical suggestions
- History and view count
- Manage documents
Nonfunctional requirements
- Consistency
- Availability
- Low latency
System Design and workflow
Clients’ requests are forwarded to the operations queue, where conflicts between collaborators are resolved, and the resulting data is stored in the time series database and in blob storage (which holds media files). Autocomplete suggestions come from the typeahead service, which is backed by a Redis cache to keep suggestions low-latency and to speed up regular updates. The application servers perform several important tasks, including importing and exporting documents and converting them from one format to another. For example, a .doc or .docx document can be converted into .pdf or vice versa.
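The text does not say which conflict-resolution technique the operations queue uses. One widely used option is operational transformation (OT), sketched here for concurrent inserts only. The `Insert` type, the `site` tie-breaker, and the function names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int   # index in the document where the text is inserted
    text: str
    site: int  # collaborator ID, used to break ties at equal positions


def apply(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Adjust `op` so it can be applied after the concurrent `against` was applied."""
    lands_before = against.pos < op.pos or (
        against.pos == op.pos and against.site < op.site
    )
    if lands_before:
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op


doc = "helo"
a = Insert(pos=3, text="l", site=1)  # collaborator 1 fixes the typo
b = Insert(pos=4, text="!", site=2)  # collaborator 2 appends punctuation

# Each replica applies its own edit first, then the transformed remote edit.
replica_1 = apply(apply(doc, a), transform(b, a))
replica_2 = apply(apply(doc, b), transform(a, b))
print(replica_1, replica_2)
```

Both replicas converge on the same text even though they applied the edits in different orders. A full editor would also need transforms for deletions, and conflict-free replicated data types (CRDTs) are a common alternative.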
Knowledge test!
How do you minimize latency when multiple users are distant from the server?
What techniques for conflict resolution are best for ensuring consistency?
Note: If you need answers to such questions, look at the detailed design of Google Docs.
18. Design Google Maps
Problem statement: Design a service that can map the route between two locations. The system should map several optimal paths to a destination based on the mode of travel. Each route should display the total mileage and an estimated time of arrival.
Sample clarifying questions!
What travel modes should be supported: driving, cycling, walking, public transport?
How frequently should traffic data be updated?
Is offline route planning and navigation required?
Requirements
Follow these requirements for the Google Maps system:
Functional requirements
- Real-time navigation
- Location/Area search
- Route search/finder
- Route planning
- Real-time notification
Nonfunctional requirements
- Scalability
- Reliability
- Low latency
- Accuracy
System Design and workflow
In the Google Maps system, clients request location-based services, such as finding a route or searching for nearby points of interest. The load balancer directs requests to various services based on the nature of the query.
For routing requests, the route finder service calculates the optimal paths between two or more points, utilizing both real-time and historical data. It relies on the graph processing service to perform complex calculations on the road network graph stored in the graph database. The location finder service provides the user’s current location or identifies the location of a specified point of interest. The area search system enables users to locate nearby places, such as restaurants or gas stations, by querying the graph database and third-party road data sources.
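To make the route finder concrete, here is a minimal sketch of a shortest-path search over a tiny road graph using Dijkstra's algorithm. The source does not state which algorithm the graph processing service runs, so treat this as one reasonable choice. The graph, the travel times in minutes, and the function name are made up for illustration.

```python
import heapq


def shortest_route(
    graph: dict[str, dict[str, float]], source: str, target: str
) -> tuple[float, list[str]]:
    """Dijkstra's algorithm; edge weights are travel times and must be non-negative."""
    best = {source: 0.0}
    previous: dict[str, str] = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper path was already found
        for neighbor, minutes in graph.get(node, {}).items():
            new_cost = cost + minutes
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if target not in best:
        return float("inf"), []  # no route exists
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return best[target], path


# Directed road segments with travel times in minutes (illustrative values).
roads = {
    "A": {"B": 4, "C": 2},
    "B": {"D": 5},
    "C": {"B": 1, "D": 8},
    "D": {},
}
minutes, route = shortest_route(roads, "A", "D")
print(minutes, route)
```

Real-time traffic can be folded in by updating edge weights before each query. At continental scale, systems typically partition the graph into segments and precompute routes between segment boundaries rather than running a single search over the whole graph.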
Knowledge test!
How do you collect the world map data? What third-party source will you use?
How do you segment the map to avoid long loading times?
How do you ensure the accuracy of ETA calculations for high-traffic times of day?
Note: Look at the detailed design of Google Maps to get answers to the questions above.
19. Design a payment gateway like Stripe
Problem statement: Design a payment gateway like Stripe capable of securely performing online or card transactions and handling millions of users simultaneously.
Sample clarifying questions!
What payment types must be supported: cards, wallets, bank transfers?
Is fraud detection and risk analysis built into the platform?
Should the system support multi-currency international transactions?
Requirements
Follow these requirements for the system:
Functional requirements
- User registration and authentication
- Payment processing
- Transaction history
- Balance management
- Mobile accessibility
Nonfunctional requirements
- Performance
- Availability
- Reliability
- Data integrity and security
- Scalability
System Design and workflow
Initially, a customer selects a product or service in the merchant’s online store and proceeds to the checkout page to provide payment details, including the card number, cardholder name, CVV or CVC, and expiration date. Clicking the pay button generates an event that hits the payment service, which stores the event, performs initial security checks, and forwards the payment details to the payment service provider for further processing. The payment gateway performs extensive security checks, transfers money from the customer’s account to the merchant’s, and provides additional services such as handling refunds and generating invoices. The card information is verified via APIs provided by the card network. Once the payment is processed, the wallet and ledger service updates the merchant’s wallet in the database to track total revenue, processing each order separately when multiple sellers are involved. The reconciliation system matches and verifies financial records to ensure accurate transaction accounting, identifying and resolving discrepancies.
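The reconciliation step can be sketched as a comparison between the internal ledger and the settlement report received from the payment service provider. The function name, the report categories, and the transaction IDs below are hypothetical. Amounts are kept as integers in the smallest currency unit to avoid floating-point rounding.

```python
def reconcile(ledger: dict[str, int], provider: dict[str, int]) -> dict[str, list[str]]:
    """Compare internal ledger amounts with the provider's report (amounts in cents)."""
    report: dict[str, list[str]] = {
        "missing_in_provider": [],
        "missing_in_ledger": [],
        "amount_mismatch": [],
    }
    for txn_id in sorted(ledger.keys() | provider.keys()):
        if txn_id not in provider:
            report["missing_in_provider"].append(txn_id)
        elif txn_id not in ledger:
            report["missing_in_ledger"].append(txn_id)
        elif ledger[txn_id] != provider[txn_id]:
            report["amount_mismatch"].append(txn_id)
    return report


ledger = {"txn-1": 2500, "txn-2": 1999, "txn-3": 500}
provider = {"txn-1": 2500, "txn-2": 1899, "txn-4": 750}
discrepancies = reconcile(ledger, provider)
print(discrepancies)
```

Each flagged transaction would then go to a resolution workflow, such as a retry, a refund, or a manual review.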
Knowledge test!
Where are the customer’s payment details encrypted during the purchase process?
How does the card network authorize a debit/credit card?
Note: Look at the Design a Payment System for a better understanding of the system.
20. Design a Scalable Data Platform for ML/AI Systems
Problem statement: Design a scalable data infrastructure that supports the machine learning lifecycle. The system must ingest data from diverse sources, process and transform it into high-quality features, and serve these features to models for both historical training and low-latency real-time inference, while ensuring consistency between the two.
Sample clarifying questions!
What is the expected latency for serving features during real-time inference?
Do we need to support both streaming data (e.g., clickstreams) and batch data (e.g., daily DB dumps)?
How should the system handle schema changes from upstream data sources?
Requirements
Follow these requirements for the ML Data Platform:
Functional requirements
- Data ingestion
- Data processing
- Feature management
- Dual serving
- Monitoring
Nonfunctional requirements
- Reliability
- Low latency
- Scalability
- Consistency
- Security
System Design and workflow
In a large-scale data infrastructure, data enters the platform through a data ingestion layer that supports dual paths: the message queue buffers real-time events processed by the stream processing engine, while batch loads are orchestrated by a workflow orchestrator.
From there, the data processing layer transforms raw data into clean features. Crucially, this layer acts as a “dual-publisher,” writing the same computed features to two destinations in the feature store layer: an offline store for generating historical training sets, and an online store (Redis/key-value store) for serving the latest values to live models. This architecture ensures that the logic used for training is mathematically identical to that used for inference, solving the “training-serving skew” problem.
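The “dual-publisher” idea can be sketched as a single feature computation whose output is written to both stores. In this simplified example, the `FeatureStore` class, the `compute_features` function, and the click-through-rate feature are assumptions, and plain Python containers stand in for the offline store and the Redis/key-value online store.

```python
def compute_features(raw: dict[str, int]) -> dict[str, float]:
    # The single transformation shared by the training and serving paths.
    views = raw["views"]
    return {"ctr": raw["clicks"] / views if views else 0.0}


class FeatureStore:
    def __init__(self) -> None:
        self.offline: list[tuple[str, int, dict[str, float]]] = []  # append-only history
        self.online: dict[str, tuple[int, dict[str, float]]] = {}   # latest value per entity

    def publish(self, entity_id: str, timestamp: int, raw: dict[str, int]) -> None:
        features = compute_features(raw)  # computed once, written twice
        self.offline.append((entity_id, timestamp, features))
        current = self.online.get(entity_id)
        if current is None or timestamp >= current[0]:
            self.online[entity_id] = (timestamp, features)

    def get_online(self, entity_id: str) -> dict[str, float]:
        return self.online[entity_id][1]

    def training_rows(self, as_of: int) -> list[tuple[str, int, dict[str, float]]]:
        # Historical view: only rows that were known at or before `as_of`.
        return [row for row in self.offline if row[1] <= as_of]


store = FeatureStore()
store.publish("user-1", timestamp=1, raw={"clicks": 3, "views": 12})
store.publish("user-1", timestamp=2, raw={"clicks": 5, "views": 10})
print(store.get_online("user-1"), store.training_rows(as_of=1))
```

Because both stores receive the output of the same `compute_features` call, a model trained on offline rows sees values produced by the same logic that serves it online.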
Knowledge test!
How does the “dual-store” architecture prevent training-serving skew?
Why is a “schema-on-read” approach preferred for the raw data lake?
Note: To explore in depth and get the answers to the above questions, check out the detailed chapter on the Design of a Scalable Data Infrastructure for AI/ML.
Bonus Design problems
In addition to the above System Design interview questions, here are a couple more design problems that can help you excel in your System Design interview.
1. Design a food delivery service like Uber Eats or DoorDash
Problem statement: Design a food delivery service like Uber Eats or DoorDash that efficiently connects hungry customers with diverse restaurants, ensuring timely and accurate order fulfillment while optimizing delivery routes and driver earnings.
Sample clarifying questions!
Should the system prioritize delivery speed, cost, or driver fairness?
Is real-time order tracking with driver location required?
Are ratings and reviews needed for restaurants and delivery agents?
Requirements
Follow these requirements for the DoorDash system:
Functional requirements
- Search menu items, cuisines, or restaurants
- Add items to the cart
- Notifications about the order status
- Track the order
- Cancel the order
- Pay for the order
- Create and update the account
- Restaurant profile creation
- Offboarding option (If the restaurant decides to discontinue service)
Nonfunctional requirements
- Latency
- Consistency
- Availability
- High throughput
System Design and workflow
The following is a high-level design of DoorDash, which consists of several services serving different purposes. Let’s describe the workflow and the interaction of the different services involved in the design.
Customers’ requests are routed through the API gateway and directed to different services via the load balancer. The search service searches for menu items, cuisines, restaurants, etc. It is one of the busiest services because customers use it whenever they look for information on the website or application. The ordering service handles menu selection, shopping cart management, and order placement. Additionally, it facilitates payment processing through an external payment gateway and stores the outcomes in the relevant database. The order fulfillment service manages the orders that restaurants have accepted and keeps track of orders being prepared.
Customers and restaurant staff use the user management service to create and manage their profiles. The dispatch service displays the orders that are ready to be picked up. It is also used to view delivery information and facilitate communication between customers and restaurant staff.
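Since the ordering, order fulfillment, and dispatch services each handle a different phase of an order, it can help to model the order as a small state machine. The states and allowed transitions below are a hypothetical sketch, not taken from the source design.

```python
from enum import Enum


class OrderStatus(Enum):
    PLACED = "placed"
    ACCEPTED = "accepted"
    PREPARING = "preparing"
    READY_FOR_PICKUP = "ready_for_pickup"
    PICKED_UP = "picked_up"
    DELIVERED = "delivered"
    CANCELLED = "cancelled"


# Assumed rule: cancellation is possible only before preparation starts.
ALLOWED = {
    OrderStatus.PLACED: {OrderStatus.ACCEPTED, OrderStatus.CANCELLED},
    OrderStatus.ACCEPTED: {OrderStatus.PREPARING, OrderStatus.CANCELLED},
    OrderStatus.PREPARING: {OrderStatus.READY_FOR_PICKUP},
    OrderStatus.READY_FOR_PICKUP: {OrderStatus.PICKED_UP},
    OrderStatus.PICKED_UP: {OrderStatus.DELIVERED},
    OrderStatus.DELIVERED: set(),
    OrderStatus.CANCELLED: set(),
}


def advance(current: OrderStatus, new: OrderStatus) -> OrderStatus:
    """Return the new status, or raise if the transition is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"cannot move order from {current.value} to {new.value}")
    return new


status = OrderStatus.PLACED
for step in (OrderStatus.ACCEPTED, OrderStatus.PREPARING, OrderStatus.READY_FOR_PICKUP):
    status = advance(status, step)
print(status.value)
```

Validating transitions in one place keeps services from putting an order into an inconsistent state, for example cancelling an order that a driver has already picked up.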
Knowledge test!
How would you handle a sudden surge in orders during peak hours, like on Super Bowl Sunday?
How would you leverage customer and delivery data to personalize recommendations, improve order accuracy, and optimize pricing?
How would you protect sensitive customer and payment information from breaches?
2. Design a video-first social platform like TikTok
Problem statement: Design a video-first social platform where users can create, upload, watch, and interact with short-form videos (reels). The system should support millions of users, deliver low-latency content, and personalize each user’s video feed based on engagement history.
Sample clarifying questions!
What is the maximum video size and length supported?
Should the video feed be globally personalized or regionally segmented?
Requirements
Follow these requirements for a video-first social platform:
Functional requirements
- Upload or create short videos
- Stream short videos
- Like, comment, and share videos
- Personalized video feed
- Follow and unfollow users
- View creator profiles
- Search by tags, music, or username
Nonfunctional requirements
- High availability
- Low latency streaming
- Scalability (both storage and delivery)
- Video processing and compression
System Design and workflow
When users open the app, their request is routed to the feed generation service through a load balancer. This service works with a recommendation service to generate a personalized list of videos based on the user’s watch history, likes, and other interactions.
Once the feed is generated, the app streams video content directly from a content delivery network (CDN) to ensure fast loading times, especially for users in different parts of the world. The videos are stored in a media storage system and processed by a video processing service, which handles compression, format conversion, thumbnail generation, and basic moderation.
When a user uploads a video, it’s routed to the video processing service. After processing, the video is saved to media storage, from where the recommendation service can include it in other users’ personalized lists.
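As a deliberately simple illustration of how a recommendation service might turn engagement history into a personalized list, the sketch below scores candidate videos by how often the user has engaged with their tags. Real recommenders use learned models. The `build_feed` function and the tag-count scoring are assumptions made for this example.

```python
from collections import Counter


def build_feed(engaged_tags: list[str], candidates: dict[str, list[str]], limit: int) -> list[str]:
    """Rank candidate videos by how often the user engaged with each of their tags."""
    affinity = Counter(engaged_tags)  # tag -> number of past engagements

    def score(video_id: str) -> int:
        return sum(affinity[tag] for tag in candidates[video_id])

    # Highest score first; ties are broken by video ID for a stable order.
    ranked = sorted(candidates, key=lambda video_id: (-score(video_id), video_id))
    return ranked[:limit]


engaged_tags = ["cooking", "cooking", "travel"]  # tags of videos the user liked or watched
candidates = {
    "v1": ["cooking"],
    "v2": ["travel", "music"],
    "v3": ["gaming"],
    "v4": ["cooking", "travel"],
}
feed = build_feed(engaged_tags, candidates, limit=3)
print(feed)
```

The feed generation service would then return these video IDs to the app, which streams the corresponding files from the CDN.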
The following high-level design represents a simple workflow of a video-first social platform like TikTok:
Knowledge test!
How would you handle millions of concurrent users uploading and watching videos?
What strategies would you use to keep the feed relevant and personalized in real time?
How would you moderate inappropriate video content before it reaches viewers?
Conclusion
Mastering these 20 System Design questions will equip you with the foundational knowledge and practical skills needed to excel in technical interviews at top companies. Remember that System Design is about understanding trade-offs, asking clarifying questions, and demonstrating your ability to think systematically about complex problems. Practice these questions regularly, focus on articulating your design decisions clearly, and you’ll build the confidence to tackle any System Design challenge that comes your way.