System Design Interview Questions 2025


Over my 10+ years as a systems engineer and hiring manager at Microsoft and Facebook, I led hundreds of software engineer candidates through System Design interviews (SDI). I became deeply familiar with the kinds of System Design interview questions that consistently distinguish top-tier engineers.

Surprisingly, even the best developers often struggle with System Design problems. Why? I think it’s because System Design questions can be open-ended, requiring creativity and problem-solving skills not practiced in other coding interview challenges.

While SDI questions tend to evolve, many have remained popular over time. These questions are well-suited to evaluate candidates on two important levels:

Test the candidate’s understanding of System Design fundamentals.
Evaluate the candidate’s ability to apply those fundamentals in real-world applications.

Today, we’ll break down the top 30 System Design interview questions for 2026. These are essential questions asked at top companies like Google, Amazon, Meta, and more. Mastering these problems and their solutions will give you a huge leg up in your System Design interview prep.

Finally, I will leave you with a few battle-tested strategies that you can use to confidently take on any System Design question you encounter.

Grokking Modern System Design Interview

For a decade, when developers talked about how to prepare for System Design Interviews, the answer was always Grokking System Design. This is that course, updated for the current tech landscape. As AI handles more of the routine work, engineers at every level are expected to operate with the architectural fluency that used to belong to Staff engineers. That's why System Design Interviews still determine starting level and compensation, and the bar keeps rising. I built this course from my experience building global-scale distributed systems at Microsoft and Meta, and from interviewing hundreds of candidates at both companies. The failure pattern I kept seeing wasn't a lack of technical knowledge. Even strong coders would hit a wall, because System Design Interviews don't test what you can build; they test whether you can reason through an ambiguous problem, communicate ideas clearly, and defend trade-offs in real time (all skills that matter more than ever in the AI era). RESHADED is the framework I developed to fix that: a repeatable 45-minute roadmap through any open-ended System Design problem. The course covers the distributed systems fundamentals that appear in every interview (databases, caches, load balancers, CDNs, messaging queues, and more), then applies them across 13+ real-world case studies: YouTube, WhatsApp, Uber, Twitter, Google Maps, and modern systems like ChatGPT and AI/ML infrastructure. Then put your knowledge to the test with AI Mock Interviews designed to simulate the real interview experience. Hundreds of thousands of candidates have already used this course to land SWE, TPM, and EM roles at top companies. If you're serious about acing your next System Design Interview, this is the best place to start.


How to answer any System Design interview question#

System Design interviews can feel intimidating because they're intentionally open-ended. Unlike coding interviews, there is rarely a single "correct" answer. Whether you're asked to design YouTube, Uber, Netflix, WhatsApp, ChatGPT, or a completely unfamiliar system, the interviewer is evaluating how you think through ambiguity, communicate trade-offs, and make engineering decisions.

This is why strong candidates don't memorize architectures. Instead, they follow a repeatable framework that helps them break large problems into smaller pieces and systematically build a solution.

One framework we recommend is RESHADED, a step-by-step approach that helps you structure your thinking and cover the areas interviewers care about most.

The RESHADED framework#

R → Requirements#

Start by clarifying the problem before drawing any architecture diagrams.

Gather both functional and nonfunctional requirements and establish the scope of the system. Strong candidates ask questions early instead of making assumptions.

Useful questions include:

What features are in scope?
How many users should the system support?
What latency requirements exist?
Is availability more important than consistency?
Are we designing for a global audience?

The goal is to leave this phase with a clear list of requirements that will guide the rest of the design.

E → Estimation#

Next, estimate the scale of the system.

Perform quick back-of-the-envelope calculations for:

Total users
Daily active users
Requests per second
Storage requirements
Bandwidth requirements

For example:

100 million registered users
10 million daily active users
1,000 requests per second on average
50 TB of stored data

These estimates help justify architecture decisions later. There's no need for perfect numbers—reasonable assumptions are enough.
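As a rough illustration of this kind of arithmetic, the sketch below derives an average request rate from the daily active users in the example. The per-user request count and the peak multiplier are assumptions chosen for illustration, not figures from any real system.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rps(daily_active_users: int, requests_per_user_per_day: float) -> float:
    """Average requests per second implied by daily usage."""
    return daily_active_users * requests_per_user_per_day / SECONDS_PER_DAY


def peak_rps(avg_rps: float, peak_factor: float) -> float:
    """Peak load, modeled as a simple multiple of the average."""
    return avg_rps * peak_factor


dau = 10_000_000           # from the example above
requests_per_user = 9      # assumption: about 9 requests per active user per day
avg = average_rps(dau, requests_per_user)
peak = peak_rps(avg, peak_factor=3)  # assumption: peak is about 3x the average

print(f"average: {avg:,.0f} requests/second, peak: {peak:,.0f} requests/second")
```

With these assumptions, the average comes out close to the 1,000 requests per second used in the example, which is the kind of sanity check interviewers tend to look for.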

S → System interface#

Define how users and external systems interact with your service.

This usually means identifying major APIs and system boundaries.

Examples:

At this stage, focus on high-level interactions rather than implementation details.

H → High-level design#

Now design the overall architecture.

Identify major components such as:

Load balancers
Application servers
Databases
Caches
Message queues
CDNs
Object storage

A typical discussion might sound like:

User requests first reach a load balancer, which distributes traffic to application servers. Frequently accessed data is served from Redis, while persistent data is stored in a database. Background tasks are handled through a message queue.

This creates the foundation for deeper discussion later.

A → API and data model#

Once the architecture is established, define the system's core entities and storage model.

Examples:

Discuss:

Database schema
SQL vs NoSQL choices
Indexing strategies
Storage requirements

This demonstrates your ability to connect application behavior to data design.

D → Deep dive#

This is the most important part of the interview.

Many candidates spend too much time drawing boxes and not enough time discussing difficult engineering problems.

Possible deep-dive topics include:

Scaling bottlenecks
Replication
Sharding
Caching strategies
Consistency models
Availability requirements
Rate limiting
Fault tolerance

For example, in a messaging system, you might discuss:

How messages are delivered reliably
How offline users receive messages
How conversations are partitioned across servers

This is where strong candidates differentiate themselves.

E → Evaluate trade-offs#

Every architecture decision comes with trade-offs.

Interviewers want to see that you understand them.

Examples include:

Avoid presenting decisions as universally correct. Instead, explain why a particular choice makes sense for the requirements.

D → Discuss improvements#

Finish by discussing how the system could evolve.

Topics may include:

Global scaling
Multi-region deployment
Observability and monitoring
Security enhancements
Disaster recovery
AI-powered features
Cost optimization

This demonstrates senior-level thinking and shows that you're considering long-term growth rather than only the initial implementation.

Mini example: Design a URL shortener#

Let's apply RESHADED to a simple System Design question.

Requirements#

Functional requirements:

Generate short URLs
Redirect users to original URLs

Nonfunctional requirements:

High availability
Low redirect latency

Estimation#

Assume:

100 million URLs stored
10 million redirects per day
About 116 redirects per second on average (10 million / 86,400 seconds), with peak traffic potentially several times higher

System interface#
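For a URL shortener, the interface can be reduced to two operations: create a short code and resolve it. The sketch below is an in-memory stand-in for illustration only; the names `UrlShortener`, `create_short_url`, and `resolve` are hypothetical, and a real service would expose these as HTTP endpoints backed by a database.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class UrlShortener:
    """In-memory stand-in for the two core operations (hypothetical names)."""

    _by_code: Dict[str, str] = field(default_factory=dict)
    _next_id: int = 1

    def create_short_url(self, long_url: str) -> str:
        # Placeholder code scheme: hex of a counter. A real design would
        # discuss short code generation in the deep dive.
        code = format(self._next_id, "x")
        self._next_id += 1
        self._by_code[code] = long_url
        return code

    def resolve(self, code: str) -> Optional[str]:
        # A redirect handler would return HTTP 404 when this is None.
        return self._by_code.get(code)


service = UrlShortener()
code = service.create_short_url("https://example.com/some/long/path")
print(code, "->", service.resolve(code))
```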

Deep dive#

Focus on:

Short code generation
Database indexing
Cache hit rates
Hot URL handling

Trade-offs#

Random IDs vs sequential IDs
SQL vs NoSQL storage
Cache size vs infrastructure cost

Even in a short interview answer, this structure ensures you cover all critical areas.

Common System Design interview mistakes#

Many candidates struggle not because they lack technical knowledge, but because they skip important steps.

Common mistakes include:

Jumping directly into databases
Not asking clarifying questions
Ignoring scale assumptions
Skipping estimations
Failing to discuss trade-offs
Spending too much time on diagrams
Not thinking out loud

Remember: interviewers evaluate your reasoning process, not just your final architecture.

Time allocation for a 45-minute interview#

A good pacing strategy looks like this:

Notice that the largest portion of the interview is spent on the deep dive. This is where most engineering judgment is demonstrated.

The best System Design candidates don't memorize architectures—they follow a repeatable framework for reasoning through unfamiliar problems.

Whether you're designing YouTube, Uber, WhatsApp, Netflix, ChatGPT, or a system you've never seen before, the RESHADED framework helps you structure your thoughts, communicate clearly, and demonstrate the engineering judgment interviewers are looking for.

Tips for any SDI question#

Start each problem by stating what you know: List all required features of the system, common problems you expect to encounter with this sort of system, and the traffic you expect the system to handle. The listing process lets the interviewer see your planning skills and correct misunderstandings before you begin the solution.
Narrate any trade-offs: Every System Design choice matters. At each decision point, list at least one positive and one negative effect of that choice.
Ask your interviewer to clarify: Most System Design questions are purposefully vague. Ask clarifying questions to show the interviewer how you view the question and your knowledge of the system’s needs. Also, be sure to state your assumptions before diving into the components.
Know your architectures: Most modern services are built upon a flexible microservice architecture. Unlike the monolithic architectures that tech companies relied on in the past, microservices allow smaller, agile teams to build independently from the larger system. Some older companies will have legacy systems, but microservices can function in parallel to legacy code and help refresh the company’s architecture.
Discuss emerging technologies: Conclude each question with an overview of how and where the system could benefit from generative AI (GenAI) and machine learning (ML). This will demonstrate that you’re prepared for not only current solutions but also future solutions.

Grokking the Generative AI System Design

GenAI System Design is emerging as its own interview category at top tech companies, distinct from traditional ML System Design. The questions are different, the architectures are different, and the scale considerations (GPU compute, parallelism, inference optimization) require their own mental models. Having spent years researching adaptive AI systems and neural networks, and now leading the creation of learning content at Educative, I designed this course to bridge the gap between understanding generative AI conceptually and being able to architect GenAI systems end-to-end. You'll learn the SCALED framework, a 6-step methodology for breaking down any GenAI System Design problem, then apply it across five real-world systems spanning text, image, speech, and video generation. Each case study walks through training architecture, deployment design, and the specific trade-offs involved in that modality. Before diving into the case studies, the course covers the foundational concepts you'll need: neural networks, transformers, tokenization, embeddings, parallelism strategies, inference optimization, RAG, and fine-tuning. You'll also learn how to do back-of-the-envelope calculations for LLM training and deployment. A bonus: if you have a GenAI or ML System Design interview coming up, this will give you both the framework and the depth to handle whatever systems you are asked to design.


Now, let’s examine the specifics of the top System Design interview questions, starting with the easy problems.

Easy System Design interview questions#

I provide a problem statement, requirements, and workflow for each question with a high-level design.

1. Design an API rate limiter for sites like Firebase or GitHub#

Problem statement: Design an API rate limiter that caps the number of API calls the service can receive in a given period to avoid an overload.

Sample clarifying questions!

Which entity is rate-limited: user, IP, token, or API key?
Are the rate-limiting rules configurable at runtime?
What is the expected scale in requests per second?

Requirements#

Follow these requirements for a rate limiter system:

System Design and workflow#

In the high-level rate limiter design, the client’s requests pass through an ID builder, which assigns a unique ID to each incoming request. The ID could be a remote IP address, login ID, or another attribute. The decision maker fetches the throttling rules from the database and applies them. It either forwards the request to the application servers via the request processor or discards it and returns an error to the client (HTTP 429 Too Many Requests). If requests are throttled because of a system overload, the system keeps them in a queue to be processed later.
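One way the decision maker could count requests is a sliding window log, which avoids the fixed-minute boundary problem. The sketch below is a single-process illustration under assumed names (`SlidingWindowLimiter`, `allow`); timestamps are passed in explicitly so the behavior is easy to follow, and a distributed deployment would need shared storage instead of a local dictionary.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client in any `window_seconds` span."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self._hits = defaultdict(deque)  # client_id -> timestamps of allowed requests

    def allow(self, client_id: str, now: float) -> bool:
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window ending at `now`.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # the caller would answer with HTTP 429
        hits.append(now)
        return True


# 10 requests at 00:01:20 (t=80s) and 10 more at 00:02:10 (t=130s),
# with an assumed limit of 15 requests per 60 seconds.
limiter = SlidingWindowLimiter(limit=15, window_seconds=60)
first = [limiter.allow("user-1", now=80) for _ in range(10)]
second = [limiter.allow("user-1", now=130) for _ in range(10)]
print(sum(first), "allowed in the first burst,", sum(second), "in the second")
```

Because the two bursts are only 50 seconds apart, the second burst is partly rejected even though the wall-clock minute changed. The trade-off is memory: the log stores one timestamp per allowed request.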

Knowledge test!

How does your system measure requests per minute? If a user makes 10 requests at 00:01:20 and then another 10 at 00:02:10, they’ve made 20 in the same one-minute window despite the minute change.
In the event of a failure, a rate limiter would be unable to perform the task of throttling. Should the request be accepted or rejected in such a scenario?
What changes would you make to the design while considering the rate limiter design for a distributed system rather than a local one?

Note: Look at the detailed design of the rate limiter to find the answers to the questions above.

2. Design a pub/sub system like Kafka#

Problem statement: Design a scalable and distributed pub/sub system like Kafka that can handle massive message throughput. It should also ensure reliable message delivery and support various messaging semantics (at most once, at least once, exactly once).

Sample clarifying questions!

What message delivery guarantee is required: at-most-once, at-least-once, or exactly-once?
Is message ordering important within topics or partitions?
How long should messages be retained in the system?

Requirements#

Follow these requirements for the pub/sub design:

Knowledge test!

How can the pub/sub design ensure message delivery and guarantee at-least-once or at-most-once semantics?
How can you guarantee message ordering for specific consumers?

Note: To answer the above technical questions, you can examine the detailed design of pub/sub.
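To make the ordering and delivery questions more concrete, here is a toy, in-memory sketch. It assumes messages with the same key are hashed to the same partition (which keeps them in order relative to each other) and that a consumer commits its offset only after processing (which gives at-least-once delivery). The class and method names are hypothetical and do not mirror any real broker's API.

```python
import hashlib
from collections import defaultdict


class Topic:
    """Toy topic: append-only partitions plus per-group committed offsets."""

    def __init__(self, num_partitions: int):
        self.partitions = [[] for _ in range(num_partitions)]
        self.committed = defaultdict(int)  # (group, partition) -> next offset to read

    def partition_for(self, key: str) -> int:
        digest = hashlib.sha256(key.encode()).digest()
        return int.from_bytes(digest[:4], "big") % len(self.partitions)

    def publish(self, key: str, value: str) -> int:
        partition = self.partition_for(key)
        self.partitions[partition].append((key, value))
        return partition

    def poll(self, group: str, partition: int) -> list:
        # Returns everything after the last committed offset. Until the
        # consumer commits, the same messages are returned again.
        return self.partitions[partition][self.committed[(group, partition)]:]

    def commit(self, group: str, partition: int, count: int) -> None:
        self.committed[(group, partition)] += count


topic = Topic(num_partitions=4)
p = topic.publish("order-42", "created")
topic.publish("order-42", "paid")
batch = topic.poll("billing", p)       # process the batch here...
topic.commit("billing", p, len(batch))  # ...then commit (at-least-once)
```

Committing before processing instead would give at-most-once behavior: a crash after the commit loses the batch rather than replaying it.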

3. Design a URL-shortening service like TinyURL or bit.ly#

Problem statement: Design a scalable and distributed system that shortens long URLs like TinyURL or bit.ly. The system takes a long URL and generates a new, unique short URL. It should also take a shortened URL and return the original full-length URL.

Sample clarifying questions!

Should shortened URLs be globally unique or user-specific?
Are custom aliases supported, and how are collisions handled?
Do URLs expire, or are they stored permanently?

Requirements#

Follow these requirements for the URL-shortening system:

System Design and workflow#

A load balancer is the first intermediary between the clients and the server, ensuring even distribution of incoming requests to maintain availability and reliability. When a new URL-shortening request comes in, the load balancer forwards it to a server where the rate limiter checks if the client is within the allowed request rate.

The server leverages a sequencer to generate a unique numeric ID for the URL requests. This ID is passed to an encoder, which converts it into a more readable alphanumeric string. The original URL and its corresponding shortened version are stored in a database. To enhance performance, recently accessed URLs are kept in a cache, allowing quick retrieval without repeatedly querying the database.
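The encoder step can be illustrated with base-62 encoding, one common choice for turning a numeric ID into a short alphanumeric string. This is a sketch under the assumption that the sequencer hands out non-negative integers; the alphabet order and function names are illustrative.

```python
import string

# 0-9, a-z, A-Z: 62 symbols (the ordering here is an arbitrary choice).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_base62(number: int) -> str:
    """Convert a non-negative sequencer ID into a short alphanumeric code."""
    if number < 0:
        raise ValueError("ID must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, remainder = divmod(number, BASE)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode_base62(code: str) -> int:
    """Inverse of encode_base62."""
    number = 0
    for ch in code:
        number = number * BASE + ALPHABET.index(ch)
    return number


print(encode_base62(125), encode_base62(10**9))
```

With 62 symbols, seven characters cover 62^7 (about 3.5 trillion) distinct IDs, which is one way to justify the short code length during estimation.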

Knowledge test!

What if two users input the same custom URL?
What if there are more users than expected?
How does the database regulate storage space?

Note: For in-depth answers to the above questions, check out the detailed chapters on the TinyURL System Design.

4. Design a scalable content delivery network (CDN)#

Problem statement: Design a scalable content delivery network (CDN) system to efficiently distribute and cache content across globally distributed servers, minimizing latency and ensuring reliable content delivery to end users.

Sample clarifying questions!

What types of content will the CDN serve: static, dynamic, or both?
What is the regional traffic distribution and expected scale?

Requirements#

Follow these requirements for a CDN system:

System Design and workflow#

When a client requests content, a request routing system kicks in to find the address of the nearest or fastest server, ensuring minimal wait time. A load balancer then routes the request to the optimal server. If the requested content is cached on that server, it is immediately delivered to the client. If not, the server fetches the content from the origin server, caches it locally for subsequent requests, and then serves it to the user.

The CDN system ensures that frequently accessed content remains readily available while less popular content is periodically purged. The system also includes monitoring and analytics to track performance, optimize routing, and ensure high availability and reliability.

Knowledge test!

How would you determine which content to cache on edge servers?
How would you distribute traffic evenly across multiple edge servers?
How would you ensure the CDN infrastructure’s scalability, availability, and fault tolerance?
How would you optimize the delivery and reduce the latency while streaming?

Note: Check out the chapter on the design of a content delivery network to help you understand and get answers to the above questions.

5. Design a web crawler#

Problem statement: Design a web crawler that systematically browses the internet to discover and index web pages. The crawler should efficiently navigate websites, retrieve content, and follow links to discover new pages.

Sample clarifying questions!

Should the crawler extract media content like images and videos or only HTML?
Should the crawler obey robots.txt and crawl-delay rules?
What is the depth and frequency of crawl required per domain?

Requirements#

Follow these requirements for the web crawler system:

Knowledge test!

What functionalities must be added to extract all formats (images and video)?
Real web crawlers have multiple workers handling separate URLs simultaneously. How does this change the queuing process?
How can you account for crawler traps?

Note: To get the answers to the above questions, check out the detailed chapters on the web crawler System Design.
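The core loop of a crawler (a frontier queue plus a set of already-seen URLs) can be sketched as a breadth-first traversal. To keep the example runnable offline, `fetch_links` is a hypothetical callback standing in for downloading a page and extracting its links; a page cap serves as a crude guard against crawler traps.

```python
from collections import deque
from urllib.parse import urldefrag, urljoin


def crawl(seed: str, fetch_links, max_pages: int = 100) -> list:
    """Breadth-first crawl starting at `seed`; returns URLs in visit order."""
    frontier = deque([seed])
    seen = {seed}
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)
        for link in fetch_links(url):
            # Resolve relative links and drop #fragments so the same page
            # is not queued twice under different spellings.
            absolute, _fragment = urldefrag(urljoin(url, link))
            if absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
    return visited


# A tiny fake "web" used instead of real HTTP requests.
FAKE_WEB = {
    "https://example.com/": ["/a", "/b"],
    "https://example.com/a": ["/b", "/#top"],
    "https://example.com/b": ["/a"],
}
pages = crawl("https://example.com/", lambda url: FAKE_WEB.get(url, []))
print(pages)
```

With multiple workers, the frontier and the seen set would have to become shared services, and politeness rules such as robots.txt and crawl delays (omitted here) would apply per domain.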

6. Design a distributed cache#

Problem statement: Design a distributed caching system that provides fast, scalable, and reliable data retrieval across multiple servers. The system should efficiently manage cache consistency, handle high volumes of read and write requests, ensure data availability, and provide mechanisms for cache eviction and expiration.

Sample clarifying questions!

What should be the typical read-to-write ratio in expected workloads?
Should the cache support write-through or write-back strategies?
Will the cache operate across regions or within a single data center?

Requirements#

Follow these requirements for the distributed cache system:

System Design and workflow#

A distributed caching system begins by partitioning the data across multiple cache nodes to balance the load and improve access speed. When a client requests data, an application server determines the appropriate cache node based on a consistent hashing algorithm, ensuring an even distribution of requests and quick lookups.
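A minimal consistent hashing ring might look like the sketch below. It assumes SHA-256 as the hash function and 100 virtual nodes per cache server; both are illustrative choices, and the class name `HashRing` is hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hashing ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes: int = 100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (hash, node) points on the ring
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def add(self, node: str) -> None:
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def node_for(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring is empty")
        # First ring point at or after the key's hash, wrapping around.
        idx = bisect.bisect_left(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
print(ring.node_for("user:1001"))
```

The useful property is that removing a node only remaps the keys that node owned; keys on the other nodes stay where they are, which limits cache misses during scaling events.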

If the data is found in the cache (a cache hit), it is returned to the client immediately, significantly reducing latency. If the data is not found (a cache miss), the system retrieves it from the primary data store, caches it, and then serves it to the client. Cache eviction policies, such as least recently used (LRU) or time-to-live (TTL), manage the removal of stale data to free up space.
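The two eviction policies named above can be combined in a small single-node sketch: entries expire after a TTL, and the least recently used entry is dropped when the cache is full. The clock is passed in as `now` to keep the example deterministic; the class name is hypothetical.

```python
from collections import OrderedDict


class LruTtlCache:
    """LRU cache whose entries also expire after `ttl_seconds`."""

    def __init__(self, capacity: int, ttl_seconds: float):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self._items = OrderedDict()  # key -> (value, expires_at), oldest use first

    def get(self, key, now: float):
        entry = self._items.get(key)
        if entry is None:
            return None  # cache miss: the caller reads the primary data store
        value, expires_at = entry
        if now >= expires_at:
            del self._items[key]  # stale entry, treated as a miss
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value, now: float) -> None:
        self._items[key] = (value, now + self.ttl)
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used


cache = LruTtlCache(capacity=2, ttl_seconds=10)
cache.put("a", 1, now=0)
cache.put("b", 2, now=1)
cache.get("a", now=2)     # "a" becomes the most recently used entry
cache.put("c", 3, now=3)  # evicts "b"
```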

Knowledge test!

How do you ensure data consistency across multiple cache nodes, especially during updates and deletions?
What strategies can be implemented to handle cache misses efficiently without overloading the primary data store?
What methods can maintain low latency and high throughput under heavy load conditions?
How do you secure the cache data against unauthorized access and ensure privacy?

Note: To answer such conceptual questions, check out the detailed design of the distributed cache.

7. Design an authentication and SSO platform like Auth0#

Problem statement: Design a secure, scalable, multi-tenant authentication platform that provides identity and access management as a service, similar to Auth0. The system must support user registration, multiple authentication methods, and seamless single sign-on (SSO) across various client applications.

Sample clarifying questions!

Is single sign-on required across domains or only within one?
Should the platform support both B2C and B2B (multi-tenant) models?

Requirements#

Follow these requirements for the authentication and SSO platform:

System Design and workflow#

When users want to log in, their request is routed through a load balancer and then sent to an authentication server. The system checks whether the user is signing in with a regular email and password or using a third-party login provider. If it’s a third-party login, the system redirects the user to the external provider for verification. Once the login is successful, the authentication service creates a secure token and returns it to the client.

This token acts like a digital badge and is used to identify the user on future requests. It can also be used across multiple applications owned by the same company, enabling single sign-on (SSO). Each token has a built-in expiration time, ensuring that user sessions do not remain active indefinitely.
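The idea of a signed token with a built-in expiration can be sketched with an HMAC over a small JSON payload. This is a simplified illustration and not a JWT implementation; the secret, the claim names (`sub`, `exp`), and the function names are assumptions, and production systems would normally rely on an established token library.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret-change-me"  # assumption: a server-side signing key


def issue_token(user_id: str, now: float, ttl_seconds: int = 3600, secret: bytes = SECRET) -> str:
    payload = json.dumps({"sub": user_id, "exp": now + ttl_seconds}).encode()
    body = base64.urlsafe_b64encode(payload).decode()
    signature = hmac.new(secret, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{signature}"


def verify_token(token: str, now: float, secret: bytes = SECRET):
    """Return the user ID if the token is authentic and unexpired, else None."""
    try:
        body, signature = token.rsplit(".", 1)
    except ValueError:
        return None  # no signature part at all
    expected = hmac.new(secret, body.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    claims = json.loads(base64.urlsafe_b64decode(body))
    if now >= claims["exp"]:
        return None  # expired session
    return claims["sub"]


token = issue_token("alice", now=1_000, ttl_seconds=60)
print(verify_token(token, now=1_030))
```

Any application that shares the verification key (or, with asymmetric signatures, the public key) can validate the same token, which is what makes single sign-on across applications possible.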

The authentication system includes important safeguards, such as limiting failed login attempts, detecting suspicious activity, and storing passwords only as salted hashes rather than in a recoverable form. It also supports multi-tenancy, meaning user data is kept separate for each business using the platform, so each company sees only its own users.

Knowledge test!

How do we safely store user passwords in the system?
How does the system recognize users across multiple apps (SSO)?
What happens if a third-party login service like Google is temporarily down?
How do we prevent too many failed login attempts from the same user or IP address?
How would the system handle logout and session expiration?

Note: The lesson on authentication and authorization explores the core concepts behind this system, including tokens, login protocols, and user permissions.

Medium System Design interview questions#

I provide each medium system design question’s problem statement, requirements, workflow, and system architecture.

8. Design a video-first social platform like TikTok#

Problem statement: Design a video-first social platform where users can create, upload, watch, and interact with short-form videos (reels). The system should support millions of users, deliver low-latency content, and personalize each user’s video feed based on engagement history.

Sample clarifying questions!

What is the maximum video size and length supported?
Should the video feed be globally personalized or regionally segmented?

Requirements#

Follow these requirements for a video-first social platform:

System Design and workflow#

When users open the app, their request is routed to the feed generation service through a load balancer. This service works with a recommendation service to generate a personalized list of videos based on the user’s watch history, likes, and other interactions.

Once the feed is generated, the app streams video content directly from a content delivery network (CDN) to ensure fast loading times, especially for users in different parts of the world. The videos are stored in a media storage system and processed by a video processing service, which handles compression, format conversion, thumbnail generation, and basic moderation.

When a user uploads a video, it’s routed to the video processing service. After processing, the video is saved to media storage, after which it can appear in other users’ personalized feeds via the recommendation service.

The following high-level design represents a simple workflow of a video-first social platform like TikTok:

Knowledge test!

How would you handle millions of concurrent users uploading and watching videos?
What strategies would you use to keep the feed relevant and personalized in real time?
How would you moderate inappropriate video content before it reaches viewers?

Note: The chapter on the content delivery network explores how content is delivered quickly and efficiently to users worldwide.

9. Design an AI-powered customer support platform#

Problem statement: Design a scalable customer support platform for a large e-commerce business. The system should use a collection of specialized AI agents to automatically understand, route, and resolve customer queries in real time. If the issue isn’t resolved automatically, the system should escalate it to a human agent with full context preserved.

Sample clarifying questions!

Which channels should be supported (chat, voice, email)?
Should the system support multilingual AI interactions?
Should users be authenticated before submitting a query?

Requirements#

Follow these requirements for the AI-powered customer support platform:

System Design and workflow#

When a customer submits a query, it is first received by a query router. This component classifies the query type, such as billing, FAQ, or technical issue, based on message content and customer context. The query is then forwarded to the appropriate specialized AI agent.

The FAQ agent retrieves standard responses from the knowledge base to answer common customer questions. For issues related to payments or orders, the billing agent securely accesses account details to provide accurate, account-specific resolutions. Meanwhile, the technical agent helps customers troubleshoot app or product-related problems by walking them through guided solutions.

If the assigned AI agent resolves the query, a response is sent back to the user. If not, the human escalation manager transfers the case to a human support agent with the full interaction history attached. Similarly, a monitoring and logging service records all activity to track performance, generate insights, and help improve future responses, as shown in the following illustration:

Knowledge test!

Why might using multiple specialized AI agents be better than one large general-purpose model?
How should the system decide when to escalate a query to a human agent?
How can the query router be made fault-tolerant in case of misclassification?
What steps are needed to add a new agent, and how would you ensure it doesn’t interfere with others?

10. Design a chat service like Facebook Messenger or WhatsApp#

Problem statement: Design a scalable, reliable, and secure real-time chat service like Facebook Messenger or WhatsApp to support instant messaging, group chats, notifications, and multimedia sharing.

Sample clarifying questions!

Should the system support both one-to-one and group chats?
Are messages required to be end-to-end encrypted?
Should messages be stored indefinitely or have a retention policy?

Requirements#

Follow these requirements for the WhatsApp System Design:

System Design and workflow#

In a real-time communication system, senders and receivers are connected to chat servers. Chat servers deliver messages from sender to receiver via a messaging queue. Various protocols, such as WebSocket, XMPP, MQTT, and the Real-time Transport Protocol (RTP), can be used for real-time communication. A connection manager establishes the real-time connections between clients and chat servers; for instance, a WebSocket manager establishes WebSocket connections between users and different chat servers. The messages are also stored persistently in the database.

Knowledge test!

What happens if a message is sent when the user isn’t connected to the internet? Is it sent when the connection is restored?
How will you encrypt and decrypt the message without increasing latency?
How do users receive notifications?
Are messages pulled from the device (the server periodically asks devices whether they have a message waiting to be sent), or are they pushed to the server (the device notifies the server that it has a message to send)?

Note: Look at the detailed design of the real-time chat service to get answers to such questions.

11. Design a social media service like Instagram#

Problem statement: Design a social media service like Instagram that is used by several million users. Users should be able to view a newsfeed containing posts from the users they follow, along with suggested new content they may like.

Sample clarifying questions!

Should feed generation be on write, on read, or hybrid?
Should the system support images, video, or only text content?
How personalized should the user feed be?

Requirements#

Follow these requirements for the Instagram system:

Based on the above requirements, let’s create a high-level design of a feed-based social system like Instagram.

System Design and workflow#

The high-level design of a feed-based social network includes a post service, a timeline generation service, a feed publishing service, and a feed ranking and recommendation engine. The post service handles the clients’ posts, and each post is published on the client’s wall (page). The timeline generation service generates feeds for friends and followers. It uses the feed ranking and recommendation engine, which ranks and recommends the top N posts to followers based on their interests, searches, and watch history. The generated feed is stored in the database, and the feed publishing service is responsible for publishing and showing the generated feeds to followers. As the feed could contain videos, the CDN is responsible for delivering the videos to followers with low latency.
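The interaction between timeline generation and ranking can be sketched as fan-out on write followed by a top-N selection. The scoring formula below (likes damped by post age in hours) is purely an assumption for illustration and stands in for a real recommendation engine; all names are hypothetical.

```python
import heapq
from collections import defaultdict, namedtuple

Post = namedtuple("Post", "post_id author created_at likes")


class FeedService:
    """Fan-out on write: each new post is copied into every follower's feed."""

    def __init__(self):
        self.followers = defaultdict(set)  # author -> set of followers
        self.feeds = defaultdict(list)     # user -> posts delivered to that user

    def follow(self, follower: str, author: str) -> None:
        self.followers[author].add(follower)

    def publish(self, post: Post) -> None:
        for user in self.followers[post.author]:
            self.feeds[user].append(post)

    def top_n(self, user: str, now: float, n: int = 10) -> list:
        def score(post: Post) -> float:
            age_hours = max(0.0, now - post.created_at) / 3600
            return (post.likes + 1) / (1 + age_hours)  # assumed ranking formula

        return [p.post_id for p in heapq.nlargest(n, self.feeds[user], key=score)]


service = FeedService()
service.follow("bob", "alice")
service.publish(Post("old-popular", "alice", created_at=0, likes=10))
service.publish(Post("new-post", "alice", created_at=32_400, likes=3))
print(service.top_n("bob", now=36_000))
```

Fan-out on write keeps reads cheap but becomes expensive for accounts with millions of followers, which is why hybrid approaches are often discussed for celebrity accounts.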

Knowledge test!

Influencers or celebrities will have millions of followers; how are they handled vs. standard users?
How does the system weigh posts by age? Old posts are less likely to be viewed than new posts.
What’s the ratio of read and write focused nodes? Are there likely to be more read requests (users viewing posts) or write requests (users creating posts)?
How can you increase availability? How does the system update? What happens if a node fails?
How do you efficiently store posts and images?

Note: Look at the detailed design of Instagram for a better understanding.

12. Design a proximity service like Yelp or nearby places/friends#

Problem statement: Design a proximity server that stores and reports the distance to places like restaurants. Users can search nearby places by distance or popularity. The database must store data for hundreds of millions of businesses across the globe.

Sample clarifying questions!

How should results be sorted: by distance, rating, or popularity?
Is real-time tracking needed for friends or businesses?

Requirements#

Follow these requirements for a System Design like Yelp:

System Design and workflow#

The system handles search requests by using load balancers to distribute read requests to the read service, which then queries the quadtree service to identify relevant places within a specified radius. The quadtree service also refines the results before they are sent to the clients. For adding places or feedback, write requests are similarly routed through load balancers to the write service, which updates a relational database and stores images in blob storage. The system also segments the world map into smaller parts, stores places in a key-value store, and periodically updates these segments to include new places. This update can run as infrequently as monthly, because new places are added relatively rarely.
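A point quadtree can be sketched as follows. For brevity, this version answers bounding-box queries on a flat plane (x as longitude, y as latitude) rather than true radius searches on a sphere; the capacity and depth limit are assumed values, and the class name is hypothetical.

```python
from dataclasses import dataclass, field

MAX_DEPTH = 16  # assumption: stop splitting so duplicate points cannot recurse forever


@dataclass
class QuadTree:
    x0: float
    y0: float
    x1: float
    y1: float  # node covers [x0, x1) x [y0, y1)
    capacity: int = 4
    depth: int = 0
    points: list = field(default_factory=list)    # (x, y, name) kept in leaves
    children: list = field(default_factory=list)  # four sub-quadrants once split

    def insert(self, x: float, y: float, name: str) -> bool:
        if not (self.x0 <= x < self.x1 and self.y0 <= y < self.y1):
            return False
        if self.children:
            return any(c.insert(x, y, name) for c in self.children)
        self.points.append((x, y, name))
        if len(self.points) > self.capacity and self.depth < MAX_DEPTH:
            self._split()
        return True

    def _split(self) -> None:
        mx, my = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        d, cap = self.depth + 1, self.capacity
        self.children = [
            QuadTree(self.x0, self.y0, mx, my, cap, d),
            QuadTree(mx, self.y0, self.x1, my, cap, d),
            QuadTree(self.x0, my, mx, self.y1, cap, d),
            QuadTree(mx, my, self.x1, self.y1, cap, d),
        ]
        pending, self.points = self.points, []
        for px, py, pname in pending:
            any(c.insert(px, py, pname) for c in self.children)

    def query(self, qx0: float, qy0: float, qx1: float, qy1: float) -> list:
        """Names of all points inside the closed box [qx0, qx1] x [qy0, qy1]."""
        if qx1 < self.x0 or qx0 >= self.x1 or qy1 < self.y0 or qy0 >= self.y1:
            return []  # the box does not overlap this node
        found = [n for x, y, n in self.points if qx0 <= x <= qx1 and qy0 <= y <= qy1]
        for child in self.children:
            found.extend(child.query(qx0, qy0, qx1, qy1))
        return found


world = QuadTree(-180, -90, 180, 90)
world.insert(-122.42, 37.77, "cafe")
world.insert(-122.41, 37.78, "bakery")
world.insert(2.35, 48.86, "bistro")
print(sorted(world.query(-123, 37, -122, 38)))
```

Dense areas end up with deeper subtrees and sparse areas with shallow ones, which is how a quadtree adapts to uneven population density.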

Knowledge test!

How do you store lots of data and retrieve search results quickly?
How should the system handle different population densities? Rigid latitude/longitude grids (a fixed, inflexible structure that divides geographic space by latitude and longitude coordinates) will cause varied responsiveness based on density.
Can we optimize commonly searched locations?

Note: Look at the detailed design of Yelp to get answers to the above questions.

Problem statement: Design a typeahead suggestion system that provides real-time, relevant autocomplete and autocorrect suggestions as users type, ensuring low latency and scalability to efficiently handle a large volume of queries.

Sample clarifying questions!

What is the maximum allowed latency for suggestions?
Should the system adapt to user search history and preferences?
How often should the autocomplete dataset be refreshed?

Requirements#

Follow these requirements for the system:

System Design and workflow#

When a user starts typing a query, each character is sent to an application server. A suggestion service gathers the top N suggestions from a distributed cache, such as Redis, and returns the list to the user. A separate service, the data collector and aggregator, takes the query, analytically ranks it, and stores it in a NoSQL database. The trie builder is a service that takes the aggregated data from the NoSQL database, builds tries, and stores them in the trie database.
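The trie builder step can be sketched as follows. This is a simplified illustration that assumes the aggregated data is a plain mapping of query to frequency; `build_trie`, `suggest`, and the per-node `top` cache are hypothetical names, and a production system would shard and persist the trie.

```python
import heapq


class TrieNode:
    def __init__(self):
        self.children = {}
        self.top = []  # cached (count, query) pairs for this prefix


def build_trie(query_counts, n=3):
    """Build a trie in which every prefix node caches its top-n queries."""
    root = TrieNode()
    for query, count in query_counts.items():
        node = root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
            node.top.append((count, query))
    # Trim each node's candidate list to the n most frequent queries.
    stack = [root]
    while stack:
        node = stack.pop()
        node.top = heapq.nlargest(n, node.top)
        stack.extend(node.children.values())
    return root


def suggest(root, prefix):
    """Walk the prefix and return the precomputed suggestions, best first."""
    node = root
    for ch in prefix:
        node = node.children.get(ch)
        if node is None:
            return []
    return [query for _, query in node.top]
```

Caching the top N at every node trades memory for speed: a lookup costs one step per typed character, with no subtree traversal at query time.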

Knowledge test!

How strongly do you weigh spelling mistake corrections?
How do you update selections without causing latency?
How do you determine the most likely completed query? Does it adapt to the user’s searches?
What happens if the user types very quickly? Do suggestions only appear after they’re done?

Note: Look at the detailed design of the Typeahead system for a better understanding of the system.

14. Design a video streaming service like YouTube or Netflix#

Problem statement: Design a video streaming service like YouTube or Netflix that allows users to upload and stream videos. The service should efficiently store many videos and their metadata and return accurate and quick results for user search queries.

Sample clarifying questions!

What is the expected volume of uploads and concurrent streams?
Are live streaming features needed, or only on-demand?

Requirements#

Follow these requirements for a streaming service System Design:

System Design and workflow#

A load balancer first handles video upload requests by sending them to the application servers. The application server interacts with the video service, which triggers transcoders to convert the video to different formats. These typically range from 144p to 1440p but can reach 4K resolutions. The transcoded video is then saved to the blob store, and its metadata is stored in the metadata database. The video service sends the transcoded video to a content delivery network (CDN), which reduces latency when delivering video to end users. Popular and recent uploads are held in the CDN, which stores and delivers requested data to users in conjunction with colocation sites.

Knowledge test!

How will your service ensure smooth video streaming on various internet qualities?
How are the videos stored?
How will the system provide a personalized experience to each user with recommendations?
How does the system react to a sudden drop in network quality, for example by shifting to lower-quality video or buffering content?

Note: Check out the detailed chapter on YouTube System Design that answers the above concerns during the design.

Problem statement: Design a system for a ride sharing service similar to Uber, where users can request rides and drivers can accept these requests. The system should efficiently match drivers to riders based on location and availability, handle real-time updates on ride statuses, manage payments securely, and ensure a smooth user experience from booking to completion of the ride.

Sample clarifying questions!

Should the system support different ride types (economy, premium, carpool)?
How frequently are driver and rider locations updated?
Are wallet systems, promotions, or refunds part of the payment system?

Requirements#

Follow these requirements for the System Design:

System Design and workflow#

A user’s request is sent to the application server via a load balancer and API gateway. The system accepts the rider’s request, and the trip service or manager provides an estimated time of arrival (ETA) based on different vehicle types. The drivers and location manager use a matching algorithm to find the nearest available drivers and send the request to those drivers by notifying them via a notification service. When a driver matches with a rider, the application should return the trip and rider information. The driver’s location is regularly recorded and communicated to relevant users through a pub/sub service.

Once the ride is complete, the trip manager ensures payment is securely processed through a payment gateway. We leverage a database that stores user and driver profiles, ride history, and payment information. We also use caching mechanisms to speed up access to frequently requested data, and constant monitoring ensures the service runs smoothly.
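One way to avoid scanning every driver when matching is to bucket drivers into grid cells and search only the cells around the rider. The sketch below assumes a fixed cell size in degrees and a planar distance approximation; `DriverIndex`, `CELL`, and the `rings` parameter are hypothetical, and real systems typically use geohashes, quadtrees, or similar spatial indexes.

```python
from collections import defaultdict
from math import floor, hypot

CELL = 0.01  # hypothetical cell size in degrees


class DriverIndex:
    def __init__(self):
        self.cells = defaultdict(dict)  # cell -> {driver_id: (lat, lon)}
        self.where = {}                 # driver_id -> current cell

    def _cell(self, lat, lon):
        return (floor(lat / CELL), floor(lon / CELL))

    def update(self, driver_id, lat, lon):
        """Record a driver's latest location, moving it between cells if needed."""
        old = self.where.get(driver_id)
        new = self._cell(lat, lon)
        if old is not None and old != new:
            del self.cells[old][driver_id]
        self.cells[new][driver_id] = (lat, lon)
        self.where[driver_id] = new

    def nearest(self, lat, lon, k=3, rings=1):
        """Return up to k closest drivers from the rider's cell and its neighbors."""
        cx, cy = self._cell(lat, lon)
        candidates = []
        for dx in range(-rings, rings + 1):
            for dy in range(-rings, rings + 1):
                for driver_id, (dlat, dlon) in self.cells.get((cx + dx, cy + dy), {}).items():
                    # Planar distance on degrees: a rough approximation only.
                    candidates.append((hypot(dlat - lat, dlon - lon), driver_id))
        return [driver_id for _, driver_id in sorted(candidates)[:k]]
```

If the neighboring cells contain too few drivers, a caller could retry with a larger `rings` value to widen the search.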

Knowledge test!

How can you keep latency low during busy periods?
How is the driver paired with the user? Iterating over all drivers to find Euclidean distance would be inefficient.
What happens if the driver or user loses connection?
How would you update the ETA during a ride in peak hours?

Note: Check out our guide to designing Uber’s backend for more information on the interview process.

16. Design a recommendation service#

Problem statement: Design a recommendation engine that suggests personalized content or products to users based on their preferences and behavior. The system should efficiently analyze user data, such as past interactions and ratings, to provide accurate and relevant recommendations.

Sample clarifying questions!

What types of content are being recommended (products, videos, etc.)?
Should recommendations be personalized or globally ranked?
Should updates happen in real time or batch processing?

Requirements#

Follow these requirements for a recommendation service:

System Design and workflow#

The recommendation engine’s System Design comprises data collection, processing, and recommendation. When users interact with the application, the data collector service collects data from application servers, such as search, viewing history, ratings, watch times, etc. This data is logged into Kafka for immediate processing.

We use real-time processors to process data and recommend content accordingly. We also use batch processors for periodic offline processing to perform detailed analyses and improve accuracy. Once the data is processed, the ML/AI engine uses algorithms such as collaborative filtering, content-based filtering, hybrid approaches, and more advanced techniques to generate personalized recommendations.
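As a small illustration of collaborative filtering, the sketch below scores unseen items by their cosine similarity to items the user has already rated. It assumes explicit ratings held in memory; the `ratings` layout and the `recommend` function are hypothetical, and a production engine would precompute similarities in the batch layer.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two {user: rating} vectors."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[u] * b[u] for u in shared)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm


def recommend(ratings, user, n=2):
    """Item-based collaborative filtering over {user: {item: rating}}."""
    # Invert the data to item -> {user: rating}.
    by_item = {}
    for u, items in ratings.items():
        for item, rating in items.items():
            by_item.setdefault(item, {})[u] = rating
    seen = ratings[user]
    scores = {}
    for candidate, raters in by_item.items():
        if candidate in seen:
            continue
        # Weight each similarity by how much the user liked the known item.
        score = sum(cosine(raters, by_item[liked]) * r for liked, r in seen.items())
        if score > 0:
            scores[candidate] = score
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]
```

Note that a brand-new user has no ratings to compare against, so this approach returns nothing for them, which is the cold start problem.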

Integrating AI to enhance the user experience is crucial to modern applications. Learn how to build generative AI applications in our course: Grokking the Generative AI System Design.

Knowledge test!

How will you handle the cold start problem for new users and content?
How would you update recommendations in real time?
How would you ensure the recommendation system scales to ever-increasing users?
What strategies would you employ to adjust recommendations dynamically based on real-time user behavior or preference changes?
How can you optimize recommendation accuracy without compromising on scalability and performance?

Problem statement: Design a scalable, synchronous, cross-platform storage system like Dropbox. Users can store files and photos and access them from other devices.

Sample clarifying questions!

What is the maximum file size supported for upload/download?
Should real-time collaboration be supported or only file syncing?
Are user storage quotas or expiration policies needed?

Requirements#

Follow these requirements for the system:

System Design and workflow#

In a high-level design of a file sharing service like Google Drive, the user's request to upload or download a file passes through a load balancer to the application servers. The application server sends the upload request to a chunk service, which splits large files into smaller, more easily manageable chunks. These chunks are then sent to a processing queue that sends and receives requests to store metadata and ensure that files are synchronized between users and accounts. Files are stored in a cloud-based object storage platform, like Amazon S3 (or on-premises blob storage). Users who want to upload or download files contact this storage service through a web server.
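The chunk service idea can be sketched with fixed-size chunks and content hashes, so that only chunks whose hash changed need to be uploaded again. The tiny `CHUNK_SIZE` and the function names are assumptions for illustration; real services use chunks measured in megabytes.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; deliberately tiny for the demo (assumption)


def chunk_hashes(data: bytes, size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks and return one SHA-256 digest per chunk."""
    return [hashlib.sha256(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)]


def changed_chunks(old: bytes, new: bytes, size: int = CHUNK_SIZE):
    """Return indexes of chunks in `new` that differ from `old` or are new."""
    old_hashes = chunk_hashes(old, size)
    new_hashes = chunk_hashes(new, size)
    return [i for i, digest in enumerate(new_hashes)
            if i >= len(old_hashes) or old_hashes[i] != digest]
```

For example, `changed_chunks(b"aaaabbbbcccc", b"aaaaBBBBccccdd")` returns `[1, 3]`: the second chunk was edited and the fourth is new. Fixed-size chunking has a known weakness: inserting bytes near the start shifts every later boundary, so more chunks appear changed than really are.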

Knowledge test!

Where are the files stored?
How do you handle updates? Do you re-upload the entire file again?
Do small updates require a full file update?
How does the system handle two users updating a document simultaneously?

Note: To further your learning, explore the detailed design of distributed file systems of tech giants like Google and Facebook (Meta).

Problem statement: Social network sites like Quora and Reddit operate on a forum-based system that allows users to post questions and links. For simplicity's sake, focus more on designing Quora. You're unlikely to need to walk through the design of something like Reddit's subreddit or karma system in an interview.

Sample clarifying questions!

What types of content are supported: text, images, videos, links?
Should voting affect visibility globally or per user?
Are real-time notifications required for interactions?

Requirements#

Follow these requirements for a System Design like Quora:

System Design and workflow#

In Quora’s high-level design, users interact through a web server, which communicates with an application server to handle actions such as posting questions, answers, and comments. Content like images and videos is stored in blob storage, and question-and-answer data, along with user profiles and interactions, are stored in a MySQL database.

A machine learning engine analyzes user interactions and content to rank answers based on relevance and quality. This engine continuously learns from user feedback to improve its ranking algorithms. For personalized user experiences, a recommendation system utilizes machine learning models to tailor content based on individual interests and behaviors.

Knowledge test!

How can you ensure the system’s scalability to handle millions of simultaneous users posting questions and answers?
What strategies can efficiently store and retrieve large multimedia content in blob storage?
How would you design the database schema to manage the relationships between users, questions, answers, and comments in a scalable way?
What techniques can be used to rank answers effectively, ensuring that high-quality content is prioritized for users?
How can you optimize the performance of the machine learning engine to rank answers quickly and accurately?

Note: Check out the detailed chapter on Quora System Design to help you understand the system.

Hard System Design interview questions#

Hard System Design interview questions refer to complex, open-ended problems that require deep technical knowledge, critical thinking, and the ability to design scalable, efficient systems under constraints. Let’s start with the System Design of a ChatGPT-style service.

19. Design a ChatGPT-style service#

Problem statement: Design a scalable and interactive conversational AI platform, similar to ChatGPT, that allows users to submit prompts and receive real-time, coherent responses from a large language model (LLM). The system should support millions of users, maintain conversation history, and deliver a fast and responsive experience.

Sample clarifying questions!

Should conversation history persist across sessions?
How important is streaming speed vs. final accuracy?
Is personalization (tone, memory) required in responses?

Requirements#

Follow these requirements for the ChatGPT-style service:

System Design and workflow#

When a user submits a prompt, the request is initially routed to the API gateway, which verifies authentication and applies rate limiting. The authorized request is then forwarded to the conversation manager. The conversation manager retrieves recent conversations from the user’s chat history, if available, and combines them with the new prompt to create the full input for the LLM. This input is sent to the LLM inference service, which may use distributed replicas or sharded models to manage high traffic.

The LLM inference service generates a response. To improve the user experience, the response is streamed back to the client as it is being generated. This makes the interaction feel fast and natural, especially for longer responses. Once the full response is ready, it is saved to the session store alongside the user’s prompt. All interactions are logged for monitoring and future improvements.

To handle peak traffic smoothly, a queue can be introduced before the inference step to buffer requests. Optionally, depending on the use case, the system may include user profile data for personalization, such as adapting tone, language, or preferred length of answers.
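The conversation manager and streaming steps above can be sketched as follows. Everything here is a stand-in: `fake_llm` is a hypothetical placeholder for the inference service, words are used as a crude proxy for tokens, and the session store is a plain dictionary.

```python
from typing import Iterator


def build_input(history, prompt, max_tokens=50):
    """Keep the most recent turns that fit the budget (words stand in for tokens)."""
    budget = max_tokens - len(prompt.split())
    kept = []
    for turn in reversed(history):
        cost = len(turn.split())
        if cost > budget:
            break
        kept.append(turn)
        budget -= cost
    return list(reversed(kept)) + [prompt]


def fake_llm(full_input) -> Iterator[str]:
    """Hypothetical stand-in for the inference service: yields one token at a time."""
    for word in f"echo: {full_input[-1]}".split():
        yield word


def handle_prompt(session_store, user_id, prompt) -> Iterator[str]:
    """Stream tokens to the caller, then persist the prompt and the full response."""
    history = session_store.setdefault(user_id, [])
    full_input = build_input(history, prompt)
    tokens = []
    for token in fake_llm(full_input):
        tokens.append(token)
        yield token  # sent to the client as soon as it is generated
    history.extend([prompt, " ".join(tokens)])
```

Because `handle_prompt` is a generator, the client receives each token immediately, while the session store is updated only after the last token has been produced.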

The following illustrations show a high-level design of a ChatGPT-style service:

Grokking the Generative AI System Design

GenAI System Design is emerging as its own interview category at top tech companies, distinct from traditional ML System Design. The questions are different, the architectures are different, and the scale considerations (GPU compute, parallelism, inference optimization) require their own mental models. Having spent years researching adaptive AI systems and neural networks, and now leading the creation of learning content at Educative, I designed this course to bridge that gap between understanding generative AI conceptually and being able to architect these GenAI systems end-to-end. You'll learn the SCALED framework, which is a 6-step methodology for breaking down any GenAI System Design problem, then apply it across five real-world systems spanning text, image, speech, and video generation. Each case study walks through training architecture, deployment design, and the specific tradeoffs involved in that modality. Before diving into the case studies, the course covers the foundational concepts you'll need: neural networks, transformers, tokenization, embeddings, parallelism strategies, inference optimization, RAG, and fine-tuning. You'll also learn how to do back-of-the-envelope calculations for LLM training and deployment. A bonus: if you have a GenAI or ML System Design interview coming up, this will give you both the framework and the depth to handle whatever systems you are asked to design.

20. Design a code deployment system#

Problem statement: Design a reliable and scalable code deployment system for a large-scale distributed application. The system should automate building, testing, and rolling out code changes across environments with minimal disruption and the ability to monitor and roll back changes when necessary.

Sample clarifying questions!

What rollback strategy is required: full, partial, or per environment?
Is deployment approval manual, automated, or both?

Requirements#

Follow these requirements for a code deployment system:

System Design and workflow#

The high-level design of the code deployment system includes all the major components needed to meet the outlined requirements. The process begins when developers submit code to a version control system (VCS). Any new code changes trigger a continuous integration (CI) service, which automatically integrates updates, runs preliminary tests, and prepares the code for deployment. Once validated, the code is published to a queue, which decouples build triggers from execution.

A dedicated build service listens to this queue and retrieves jobs to compile the code. It then generates binary artifacts and stores them in a versioned blob storage system. These artifacts represent the system’s deployable output. When it’s time to deploy, the deployment service pulls the necessary artifacts from blob storage and installs them on machines across different regions. This ensures consistent deployments in multiple environments, such as staging and production.

The architecture supports gradual rollouts, rollback mechanisms, and monitoring at each step, helping to reduce risks and improve reliability in production.
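A gradual rollout with rollback can be sketched as a loop that deploys one region at a time and restores previous versions on the first failed health check. The `rollout` function, the `versions` table, and the `deploy` callback are hypothetical stand-ins for the deployment service and its monitoring hooks.

```python
def rollout(regions, artifact, deploy, healthy):
    """Deploy region by region; on the first unhealthy region, undo every step."""
    done = []
    for region in regions:
        previous = deploy(region, artifact)
        done.append((region, previous))
        if not healthy(region):
            # Roll back in reverse order so the latest change is undone first.
            for rolled_region, old_artifact in reversed(done):
                deploy(rolled_region, old_artifact)
            return False
    return True


# Demo state: which artifact version each region currently runs (assumption).
versions = {"staging": "v1", "us-east": "v1", "eu-west": "v1"}


def deploy(region, artifact):
    """Install an artifact in a region and return the version it replaced."""
    previous = versions[region]
    versions[region] = artifact
    return previous
```

Ordering the regions so that staging comes first limits how far a bad build can spread before a health check stops it.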

A high-level design of a code deployment system is depicted in the following illustration:

System Design and workflow#

In the following high-level design of a newsfeed system, clients post or request their newsfeed through the app, and the load balancer redirects each request to a web server for authentication and routing. Whenever a user's friends (or followers) create a post via the post service, the notification service informs the newsfeed generation service, which generates newsfeeds from those posts and keeps them in the newsfeed cache. The newsfeed publishing service then publishes the generated feeds from the newsfeed cache to the user's timeline. It also attaches multimedia content from the blob storage to a newsfeed if required.
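The newsfeed generation step can be illustrated as a merge of each followed user's posts, newest first. This sketch assumes every user's posts are already stored in descending timestamp order; `generate_feed` and the `(timestamp, text)` tuple layout are hypothetical.

```python
import heapq
from itertools import islice


def generate_feed(posts_by_user, following, limit=3):
    """Merge per-user post lists (each sorted newest first) into one feed."""
    streams = [posts_by_user.get(user, []) for user in following]
    # heapq.merge is lazy, so only the first `limit` posts are ever pulled.
    merged = heapq.merge(*streams, key=lambda post: post[0], reverse=True)
    return [text for _, text in islice(merged, limit)]
```

Because the merge is lazy and the result is truncated, the cache only needs to hold the top few posts per user instead of a complete timeline, which is one way to reduce memory consumption.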

Knowledge test!

Creating and storing newsfeeds for each user in the cache requires enormous memory. Is there any way to reduce this memory consumption?
What mechanisms would you implement to prioritize and filter content in the newsfeed to prevent information overload for users?
How can the system ensure consistency and order of posts in the newsfeed, especially in a distributed environment with multiple data centers?

Note: If you need answers to such questions, look at the detailed design of a newsfeed service.

22. Design a collaborative editing service like Google Docs#

Problem statement: Design a collaborative editing service that lets users remotely and simultaneously make changes to text documents. The changes should be displayed in real time. Like other cloud-based services, documents should be consistently available to any logged-in user on any machine. Your solution must be scalable to support thousands of concurrent users.

Sample clarifying questions!

What collaboration model is used: character-level or paragraph-level?
Should the system support offline editing and later sync?
How will conflicts between concurrent edits be resolved?

Requirements#

Follow these requirements for the Google Docs system:

System Design and workflow#

Clients’ requests are forwarded to the operations queue, where conflicts between different collaborators are resolved, and the data is stored in the time series database and blob storage (which stores media files). Autocomplete suggestions are made via the typeahead service. This service relies on a Redis cache to enable low-latency suggestions and speed up regular updates. The application servers perform several important tasks, including importing and exporting documents and converting documents from one format to another. For example, a .doc or .docx document can be converted into .pdf or vice versa.
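The text above does not say how conflicts are resolved. One common technique is operational transformation (OT), assumed here purely for illustration. The sketch handles only concurrent inserts and uses a hypothetical `site` ID as a tie-breaker; real editors also handle deletes and formatting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # index in the document where the text is inserted
    text: str
    site: int   # ID of the originating client, used to break ties (assumption)


def transform(op, against):
    """Adjust `op` so it can be applied after `against` has already been applied."""
    earlier = against.pos < op.pos
    tie_won = against.pos == op.pos and against.site < op.site
    if earlier or tie_won:
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op


def apply(doc, op):
    return doc[:op.pos] + op.text + doc[op.pos:]
```

If two clients insert concurrently into "abc", each applies its own operation first and then the transformed remote one. With `Insert(1, "X", 1)` and `Insert(2, "Y", 2)`, both orders produce "aXbYc", so the replicas converge.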

Knowledge test!

How do you minimize latency when multiple users are distant from the server?
What techniques for conflict resolution are best for ensuring consistency?

Note: If you need answers to such questions, look at the detailed design of Google Docs.

23. Design Google Maps#

Problem statement: Design a service that can map the route between two locations. The system should map several optimal paths to a destination based on the mode of travel. Each route should display the total mileage and an estimated time of arrival.

Sample clarifying questions!

What travel modes should be supported: driving, cycling, walking, public transport?
How frequently should traffic data be updated?
Is offline route planning and navigation required?

Requirements#

Follow these requirements for the Google Maps system:

System Design and workflow#

In the Google Maps system, clients request location-based services, such as finding a route or searching for nearby points of interest. The load balancer directs requests to various services based on the nature of the query.

For routing requests, the route finder service calculates optimal paths between two or more points using real-time and historical data. It relies on the graph processing service to perform complex calculations on the road network graph stored in the graph database. The location finder service provides the user’s current location or identifies the location of a specified point of interest. The area search system lets users find nearby places, such as restaurants or gas stations, by querying the graph database and third-party road data sources.
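As a minimal illustration of what the graph processing service computes, the sketch below runs Dijkstra's algorithm over a small road graph whose edge weights are travel times in minutes. The graph layout and `shortest_route` are assumptions; production routing usually relies on partitioned graphs and precomputed shortcuts instead of a single full search.

```python
import heapq


def shortest_route(graph, source, target):
    """Dijkstra over {node: {neighbor: minutes}}; returns (minutes, path) or None."""
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper path was already found
        for neighbor, minutes in graph.get(node, {}).items():
            new_cost = cost + minutes
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if target not in best:
        return None
    # Walk the predecessor links back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return best[target], path[::-1]
```

Feeding live traffic into the edge weights is what lets the same search produce different routes and ETAs at different times of day.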

Knowledge test!

How do you collect the world map data? What third-party source will you use?
How do you segment the map to avoid long loading times?
How do you ensure the accuracy of ETA calculations for high-traffic times of day?

Note: Look at the detailed design of Google Maps to get answers to the questions above.

24. Design a payment gateway like Stripe#

Problem statement: Design a payment gateway like Stripe capable of securely performing online or card transactions and handling millions of users simultaneously.

Sample clarifying questions!

What payment types must be supported: cards, wallets, bank transfers?
Is fraud detection and risk analysis built into the platform?
Should the system support multi-currency international transactions?

Requirements#

Follow these requirements for the system:

System Design and workflow#

Initially, a customer selects a product or service via the merchant’s online store and proceeds to the checkout page to provide payment details, including card number, cardholder name, CVV or CVC, and expiration date. Upon clicking the pay button, an event is generated that hits the payment service, which stores the event, performs initial security checks, and forwards the payment details to the payment service provider for further operations. The payment gateway performs extensive security checks, moves money from the customer’s account to the merchant’s, and provides secondary services like handling refunds and generating invoices. The card information is verified via APIs provided by the card network. Once the payment is processed, the wallet and ledger service updates the merchant’s wallet in the database to track total revenue and processes each order separately in case of multiple sellers. The reconciliation system matches and verifies financial records to ensure accurate transaction accounting, identifying and resolving discrepancies.
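The reconciliation step can be sketched as a comparison between the internal ledger and the payment service provider's settlement report. The record layout (transaction ID mapped to an amount in cents) and the issue labels are assumptions for illustration.

```python
def reconcile(ledger, provider):
    """Compare {txn_id: amount_in_cents} records and list every discrepancy."""
    issues = []
    for txn_id in sorted(set(ledger) | set(provider)):
        ours, theirs = ledger.get(txn_id), provider.get(txn_id)
        if ours is None:
            issues.append((txn_id, "missing_in_ledger"))
        elif theirs is None:
            issues.append((txn_id, "missing_at_provider"))
        elif ours != theirs:
            issues.append((txn_id, "amount_mismatch"))
    return issues
```

Storing amounts as integer cents avoids floating-point rounding, which matters when records must match exactly.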

25. Design a food delivery service like Uber Eats or DoorDash#

Problem statement: Design a food delivery service like Uber Eats or DoorDash that efficiently connects hungry customers with diverse restaurants, ensuring timely and accurate order fulfillment while optimizing delivery routes and driver earnings.

Sample clarifying questions!

Should the system prioritize delivery speed, cost, or driver fairness?
Is real-time order tracking with driver location required?
Are ratings and reviews needed for restaurants and delivery agents?

Requirements#

Follow these requirements for the DoorDash system:

System Design and workflow#

The following is a high-level design of DoorDash, consisting of several services for different purposes. Let’s describe the workflow and the interaction of the different services involved in the design.

Customers’ requests are routed through the API gateway and directed to different services via the load balancer. The search service searches for menu items, cuisines, restaurants, etc. It is one of the busiest services because customers use it whenever they search the website or application. The ordering service handles menu selection, managing the shopping cart, and placing food orders. Additionally, it facilitates payment processing through an external payment gateway and stores the outcomes in the relevant database. The order fulfillment service is used to manage the orders that the restaurants have accepted. It also keeps track of orders being prepared.

Customers and restaurant staff use the user management service to create and manage their profiles. The dispatch service displays the orders ready to be picked up. It is also used to view delivery information and facilitate communication between customers and restaurant staff.

Knowledge test!

How would you handle a sudden surge in orders during peak hours, like on Super Bowl Sunday?
How would you leverage customer and delivery data to personalize recommendations, improve order accuracy, and optimize pricing?
How would you protect sensitive customer and payment information from breaches?

26. Design a distributed locking service like Google Chubby locking#

Problem statement: Design a highly available, fault-tolerant distributed locking service like Google Chubby to coordinate access to shared resources in a large-scale distributed system.

Sample clarifying questions!

Should the system support both read and write locks?
What happens if a client holding a lock crashes or disconnects?
Is lock expiration or lease renewal required?

Requirements#

Follow these requirements for the Google Chubby locking system:

System Design and workflow#

The Chubby cell is composed of multiple servers (usually five), all replicas of each other. One of these servers is a leader with whom the clients must communicate. Each server has a namespace that is composed of directories and files that contain data that is relevant to different applications. In addition to this namespace, the server contains an ACL files directory that holds the access control lists of all the files and directories within the namespace. (An access control list, or ACL, is a list that tells which processes or people can be given what kind of access to computational resources.) The Chubby library mediates communication between clients and servers in a Chubby cell. It takes a request from a client who wants to use the Chubby service, finds the relevant cell, directs the request to that cell via remote procedure calls (RPCs), and then reports any changes made in the namespace, data, or metadata (also known as events) back to the client.

Knowledge test!

How does Chubby recover from server failures and network partitions while maintaining data consistency?
How does Chubby handle client failures and session timeouts?

Note: Look at the detailed design of Google Chubby locking to get answers to the above questions.

27. Design a coordination system like ZooKeeper#

Problem statement: Design a highly available, fault-tolerant, and scalable coordination system like ZooKeeper to manage configuration, naming, synchronization, and group services in a distributed system.

Sample clarifying questions!

What coordination features are needed: leader election, locking, or configuration?
What consistency level is required during network partitions?
Should clients be notified of changes (watch mechanism)?

Requirements#

Follow these requirements for the ZooKeeper system:

System Design and workflow#

The clients are the applications that use ZooKeeper as a coordination service for their application processes. The ZooKeeper client library (API) provides functions such as create(), delete(), exists(), and many more to manage and use the coordination data. Through this API, the client request is forwarded to the ZooKeeper server. The ZooKeeper server represents a process that provides the ZooKeeper coordination service. It stores all the coordination data from different applications and their processes in memory. The namespace for applications/clients and their coordination data is organized in a hierarchy (in the form of a tree). The client application processes store their coordination data on znodes. These processes can perform all the operations provided in the ZooKeeper client API. Each znode can be accessed through its path in the standard UNIX notation (like having / for the root directory). A set of ZooKeeper servers is called a ZooKeeper ensemble. All servers in the ensemble are replicas. One is elected as the leader, while the others become followers.
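To show how a hierarchical znode namespace behaves, here is a toy in-memory model with create(), exists(), and delete(). It is not the real ZooKeeper client library; `ZNodeTree` is a hypothetical class, and the rules it enforces (a parent must exist before a child is created, and a node with children cannot be deleted) reflect my understanding of ZooKeeper's behavior.

```python
class ZNodeTree:
    """Toy in-memory znode namespace; not the real ZooKeeper client API."""

    def __init__(self):
        self.nodes = {"/": b""}  # path -> data, starting with the root znode

    @staticmethod
    def _parent(path):
        return path.rsplit("/", 1)[0] or "/"

    def create(self, path, data=b""):
        if path in self.nodes:
            raise KeyError(f"{path} already exists")
        if self._parent(path) not in self.nodes:
            raise KeyError(f"parent of {path} does not exist")
        self.nodes[path] = data

    def exists(self, path):
        return path in self.nodes

    def delete(self, path):
        # Refuse to delete a znode that still has children.
        if any(p != path and self._parent(p) == path for p in self.nodes):
            raise ValueError(f"{path} has children")
        del self.nodes[path]
```

A client would create `/app` before `/app/config`, and must delete `/app/config` before it can delete `/app`.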

Knowledge test!

We have a collection of servers in the ZooKeeper ensemble. What should be the minimum number of servers, and why?

Note: If you need answers to such questions, look at the detailed design of ZooKeeper.
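As a starting point for reasoning about ensemble size, the arithmetic of majority quorums can be written down in a few lines. The function names below are hypothetical. The sketch assumes only that progress requires a strict majority of servers to be available.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """Servers that can fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (1, 2, 3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
# 1 1 0
# 2 2 0
# 3 2 1
# 4 3 1
# 5 3 2
```

The table shows why odd sizes are the usual choice: going from three servers to four raises the quorum size without tolerating any additional failure.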

28. Design a scalable distributed storage system like Bigtable#

Problem statement: Design a massively scalable distributed storage system like Bigtable capable of handling petabytes of structured and unstructured data with low latency reads and writes, supporting flexible schema, efficient query patterns, and high availability while ensuring data consistency and durability.

Sample clarifying questions!

What is the expected workload pattern: read-heavy, write-heavy, or balanced?
Should the system offer strong consistency on read-after-write?
Is the schema flexible or strictly enforced?

Requirements#

Follow these requirements for the Bigtable system:

System Design and workflow#

The following illustration shows that the Bigtable implementation consists of three main parts: a library linked to each client, one Bigtable manager server, and several tablet servers. The library is a component that all clients share, and it enables clients to communicate with Bigtable. The manager server allocates tablets to tablet servers, detects tablet server additions and expirations, balances load across tablet servers, and garbage-collects files in GFS (a distributed file system). It also handles schema changes such as table and column family creation. Each tablet server is in charge of a group of tablets, generally around 10 to 1,000, and serves reads and writes for the tablets allocated to it. Servers can be added to or removed from a Bigtable cluster as needed. New tablets can be created and assigned, old ones can be merged, and tablets can be reassigned from one server to another to accommodate changes in demand.
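The allocation and reassignment duties described above can be sketched as follows. This is a simplified model under stated assumptions: tablets are contiguous row-key ranges defined by sorted split keys, the manager places each tablet on the least-loaded server, and lookups go through the manager object for brevity (a real deployment would not route every lookup through the manager). `TabletManager` and its methods are hypothetical names.

```python
import bisect


class TabletManager:
    """Toy manager: maps row-key ranges (tablets) to tablet servers."""

    def __init__(self, split_keys, servers):
        # Tablet i holds the rows in [split_keys[i - 1], split_keys[i]).
        self.split_keys = sorted(split_keys)
        self.load = {server: 0 for server in servers}
        self.assignment = {}
        for tablet_id in range(len(self.split_keys) + 1):
            self._assign(tablet_id)

    def _assign(self, tablet_id):
        # Least-loaded server wins; ties are broken by server name.
        server = min(self.load, key=lambda s: (self.load[s], s))
        self.assignment[tablet_id] = server
        self.load[server] += 1

    def tablet_for(self, row_key):
        return bisect.bisect_right(self.split_keys, row_key)

    def server_for(self, row_key):
        return self.assignment[self.tablet_for(row_key)]

    def remove_server(self, server):
        """Reassign the tablets of a server that expired or was removed."""
        orphaned = [t for t, s in self.assignment.items() if s == server]
        del self.load[server]
        for tablet_id in orphaned:
            self._assign(tablet_id)
```

With split keys `["g", "n", "t"]` and two servers, the four tablets are spread evenly. After `remove_server("ts2")`, its tablets move to the remaining server and lookups keep working.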

Knowledge test!

How does Bigtable efficiently support schema changes without impacting performance?
How does Bigtable ensure data distribution and replication across multiple servers?

Note: If you need answers to such questions, look at the detailed design of Bigtable.

29. Design an online multiplayer game system#

Problem statement: Design an online multiplayer game system that allows players to connect and play in real time. The system should handle player matchmaking, maintain low latency communication, ensure player synchronization, and consistently manage game state.

Sample clarifying questions!

What game type is being built: real-time, turn-based, or battle royale?
What is the maximum number of concurrent players per session?
Should cross-platform play (mobile, console, desktop) be supported?

Requirements#

Follow these requirements for such a system:

System Design and workflow#

In an online multiplayer game system, players connect to the game server, which handles matchmaking by pairing players based on skill levels and preferences. Once matched, the server maintains low latency communication between players, ensuring smooth and real-time interactions using a pub/sub service. The game state, including player positions and actions, is synchronized across all players’ devices through a central game state manager. The session service manages sessions and synchronizes the players. The play service will handle all the game-related tasks like updating stats, checking players’ availability, etc. The payment service facilitates in-app purchases of assets.

For a better user experience, we can separate real-time operations, such as gameplay, from non-real-time operations, such as invites and in-app purchases.
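The matchmaking step mentioned above can be illustrated with a small skill-based pairing routine. This is one simple heuristic among many, not a prescribed algorithm: the `Player` record, the `match_players` function, and the `max_skill_gap` threshold are assumptions made for the example, and player preferences are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Player:
    player_id: str
    skill: int  # hypothetical rating, higher is stronger


def match_players(queue, max_skill_gap=100):
    """Pair waiting players with close ratings.

    Returns (matches, still_waiting). Players are sorted by skill and
    adjacent players are paired when their gap is within the threshold.
    """
    ordered = sorted(queue, key=lambda p: p.skill)
    matches, waiting = [], []
    i = 0
    while i < len(ordered):
        has_next = i + 1 < len(ordered)
        if has_next and ordered[i + 1].skill - ordered[i].skill <= max_skill_gap:
            matches.append((ordered[i], ordered[i + 1]))
            i += 2
        else:
            waiting.append(ordered[i])
            i += 1
    return matches, waiting
```

Players left in `waiting` stay in the queue for the next round. A production matchmaker would typically widen the allowed gap as waiting time grows, so that nobody waits indefinitely.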

Knowledge test!

How can you ensure system stability and prevent crashes when millions of users play simultaneously?
How would you implement lag compensation and data buffering to handle network delays and ensure smooth gameplay?
What are the benefits of using a virtual private cloud (VPC)?
How can you maintain low latency for real-time communication, especially during peak usage?
How would you limit the number of requests to the server without compromising the real-time gaming experience?

Note: To learn more about gaming service design details, explore the gaming API design chapter.
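For the request-limiting question above, one common building block is a token bucket, which allows short bursts while capping the sustained rate. The sketch below is a minimal single-process version with hypothetical names. Time is passed in explicitly so the behavior is easy to reason about, and the capacity and refill rate are placeholders rather than recommended values.

```python
class TokenBucket:
    """Minimal token bucket: bursts up to `capacity`, refilled at a steady rate."""

    def __init__(self, capacity, refill_per_second, now=0.0):
        self.capacity = float(capacity)
        self.refill_per_second = float(refill_per_second)
        self.tokens = float(capacity)  # start full so an initial burst is allowed
        self.last = now

    def allow(self, now, cost=1.0):
        # Refill according to the time elapsed since the previous call.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = max(self.last, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A game server could keep one bucket per player for non-critical calls (chat, inventory, invites) and leave latency-sensitive gameplay traffic on a separate path, so that throttling does not interfere with real-time play.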

30. Design a Zoom-like video conferencing system#

Problem statement: Design a real-time video conferencing system that supports high-quality meetings with hundreds of participants. The platform should offer interactive features like breakout rooms and polls, work reliably across different network conditions, and scale globally with low latency.

Sample clarifying questions!

What is the maximum number of participants per session?
Should video quality adapt based on bandwidth conditions?
Is end-to-end encryption required for audio, video, and chat?

Requirements#

Follow these requirements to design the system:

System Design and workflow#

This high-level design represents a video conferencing service that incorporates several components to provide a seamless experience for users. The system starts with the client, communicating with the API gateway to initiate requests. The API gateway handles authentication and directs the request to the load balancer, efficiently distributing traffic to the appropriate services, such as the user service, scheduling service, meeting service, and messaging service. These services manage user data, scheduling of meetings, real-time communication during meetings, and messaging functionalities. Additionally, the CDN ensures that video and media content is delivered with low latency to users across different geographical regions.

The media router (SFU) plays a critical role in managing media streams in real-time. It handles video and audio streams from multiple participants and forwards them to other participants without modifying the content, ensuring efficient bandwidth usage. The system also integrates a cloud processing service to handle more complex tasks like video processing or analytics. Data is stored in a blob store and a database to keep records of meetings, messages, and user information.
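The forwarding behavior of the media router can be sketched in a few lines. This toy model uses hypothetical names and in-memory lists in place of network connections. It shows only the core idea: each packet a participant publishes is relayed unmodified to every other participant.

```python
class SelectiveForwardingUnit:
    """Toy SFU: relays each published packet, unmodified, to all other participants."""

    def __init__(self):
        self.inboxes = {}  # participant -> list of (sender, packet)

    def join(self, participant):
        self.inboxes[participant] = []

    def leave(self, participant):
        self.inboxes.pop(participant, None)

    def publish(self, sender, packet):
        if sender not in self.inboxes:
            raise KeyError(f"unknown participant: {sender}")
        delivered = 0
        for participant, inbox in self.inboxes.items():
            if participant != sender:
                inbox.append((sender, packet))  # forwarded as-is, no transcoding
                delivered += 1
        return delivered


def uploads_per_client(participants, use_sfu):
    """Streams each client must upload: 1 via an SFU, N - 1 in a full mesh."""
    return 1 if use_sfu else participants - 1
```

The helper at the end captures the bandwidth argument: with an SFU each client uploads a single stream regardless of meeting size, whereas a full peer-to-peer mesh would require one upload per remote participant.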

The following illustration shows a high-level design of a video conference service:

Knowledge test!

How does end-to-end encryption affect server-side features like recording or transcription?
How would you implement adaptive bitrate streaming to ensure smooth performance on weak networks?
What parts of the system will most likely break if a popular meeting gets thousands of participants simultaneously? How can you design to prevent that?

Note: Explore the Zoom API design to learn more about designing a video conference service and determine the answers to the above questions.
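As a hint for the adaptive bitrate question above, one ingredient is choosing the best quality level that fits a participant's measured bandwidth. The sketch below is a minimal example of that selection step. The bitrate ladder, the `pick_rendition` name, and the safety factor are illustrative assumptions, not values taken from any real conferencing product.

```python
# Hypothetical bitrate ladder in kilobits per second.
RENDITIONS_KBPS = {"180p": 300, "360p": 800, "720p": 2500, "1080p": 5000}


def pick_rendition(measured_kbps, safety_factor=0.8, ladder=None):
    """Return the highest rendition whose bitrate fits within the bandwidth budget.

    Only a fraction of the measured bandwidth is used (safety_factor) to leave
    headroom for audio and fluctuations. If nothing fits, the lowest rendition
    is returned so that the stream degrades instead of stopping.
    """
    ladder = RENDITIONS_KBPS if ladder is None else ladder
    budget = measured_kbps * safety_factor
    affordable = [(rate, name) for name, rate in ladder.items() if rate <= budget]
    if not affordable:
        return min(ladder, key=ladder.get)
    return max(affordable)[1]


print(pick_rendition(4000))   # 720p
print(pick_rendition(200))    # 180p
print(pick_rendition(10000))  # 1080p
```

In a full design, the client or the media router would re-run this selection whenever bandwidth estimates change, usually with some smoothing so that quality does not oscillate between levels.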

The final step of your interview prep#

Mastering these 30 questions is a fantastic first step toward comprehensive System Design interview preparation.

However, there are plenty more System Design concepts you’ll need to know for a real-world System Design interview. Educative has created an exhaustive course: Grokking Modern System Design Interview, which includes more detailed questions and answers and the opportunity to get hands-on practice.

This interactive course covers the building blocks of the modern System Design concept, coupled with more than a dozen real-world questions currently used in the industry. By the end of the course, you will understand what clarifying questions to ask and tradeoffs to make for each question. Ultimately, you will learn exactly what it takes to stand out to interviewers in the current hiring market.

That’s why if I had to pick just one System Design prep resource to give you, this would be it.

Quick tips to tackle System Design interview questions#

We all dream of passing the System Design interview with flying colors!

So why not make this dream a reality with some quick tips:

Practice structured thinking: Always start with a clear outline of your approach. Break problems down systematically, such as:

Requirements ➔ Components ➔ APIs ➔ Data models ➔ Bottlenecks ➔ Trade-offs

Clarify early, clarify often: Never rush into a design. Spend the first 5-10 minutes asking questions and defining system constraints and assumptions.
Prioritize communication: Think out loud. Walk your interviewer through your decisions, trade-offs, and reasoning, even if unsure.
Use diagrams: Visual aids like high-level architecture diagrams make your solution much easier to understand and show that you think like a true architect.
Review actual systems: Study the architectures of popular platforms like Instagram, Uber, Netflix, and Dropbox to understand real-world trade-offs.
Stay calm under pressure: Keep moving forward even if you get stuck. Composure, problem-solving attitude, and logical thought are often more important than reaching a “complete” design.

I wish you the best of luck with your interviews. I am confident that with a little hard work and strategic preparation, you will be successful.

Happy learning!


Frequently Asked Questions

What is the purpose of a System Design interview?

The purpose is to evaluate a candidate’s ability to design scalable, efficient, and maintainable systems. It tests their problem-solving skills, understanding of architecture, and ability to communicate complex ideas.

Is the System Design interview hard?

Yes, System Design interviews can be hard. The difficulty level varies based on the engineering role, with junior engineers facing more straightforward problems, while senior, staff, and principal engineers encounter more complex scenarios requiring in-depth architectural knowledge and scalability considerations.

What is the role of System Design in software engineering?

System Design in software engineering involves defining a system’s architecture, components, modules, interfaces, and data to satisfy specified requirements. Ensuring the system is scalable, reliable, maintainable, and performant is crucial. System Design helps identify the best strategies for load balancing, data storage, and handling concurrent user requests. It also plays a key role in facilitating communication between different system parts and ensuring they work together seamlessly. This process is vital for building complex software applications and services that can efficiently handle large-scale operations and adapt to changing needs.

Does System Design require coding?

No, System Design does not necessarily require coding. It focuses more on creating a high-level architecture and understanding how components interact and scale, although coding knowledge can help you understand practical implementation details.

How do I answer System Design interview questions?

Here’s how you can excel at System Design interview questions:

Understand what interviewers are expecting from you.
Follow this five-step framework for system design interviews:
1. Start by clearly defining the problem you’re addressing.
2. Sketch out a high-level design of the system.
3. Go deeper into the specifics of your design.
4. Identify any potential bottlenecks and discuss scalability.
5. Wrap up by reviewing your design and summarizing the key points.
Watch out for common mistakes to avoid during the process.

How long are System Design interviews?

System Design interviews typically last between 45 and 60 minutes. During this time, candidates must understand the problem, ask clarifying questions, outline a high-level design, delve into detailed components, and discuss trade-offs and scalability considerations. The aim is to evaluate candidates’ ability to design scalable and efficient systems while communicating their thoughts.

Which companies ask System Design interview questions?

Tech companies commonly ask System Design interview questions, especially those focused on software engineering and development roles. Some big tech companies include Meta, Amazon, Apple, Netflix, Google (MAANG), and Microsoft. These questions help assess candidates’ ability to design scalable, reliable, and efficient software systems.

How can I prepare for System Design interview questions?

Preparing for System Design interviews is challenging, but a structured approach can make it manageable. Start by reviewing the fundamental concepts of System Design, including scalability, reliability, availability, consistency, and other distributed systems concepts. Next, understand the role of core components like databases, distributed storage, load balancers, and caching mechanisms. Once familiar with these, study various architectural patterns, such as microservices, event-driven architecture, and pub/sub models. Additionally, practice designing systems by working through real-world scenarios and identifying trade-offs to demonstrate your ability to make informed decisions during interviews.

What are some common mistakes to avoid in System Design interviews?

Avoid diving into details too early, not clarifying requirements, ignoring scalability and reliability aspects, and not considering trade-offs. Make sure to communicate your thought process.

Step	Goal
Requirements	Understand the problem
Estimation	Quantify scale
System interface	Define interactions
High-level design	Build architecture
API and data model	Define data flow
Deep dive	Solve hard problems
Evaluate trade-offs	Show engineering judgment
Discuss improvements	Demonstrate senior thinking

Entity	Fields
User	user_id, name, email
Post	post_id, author_id, content
Message	sender_id, receiver_id, timestamp
Video	video_id, owner_id, storage_url
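The entities listed above can be written down as typed records when sketching a data model. The field names come from the table, while the Python types (strings for identifiers, a float for the timestamp) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class User:
    user_id: str
    name: str
    email: str


@dataclass
class Post:
    post_id: str
    author_id: str  # refers to User.user_id
    content: str


@dataclass
class Message:
    sender_id: str    # refers to User.user_id
    receiver_id: str  # refers to User.user_id
    timestamp: float  # assumed to be seconds since the Unix epoch


@dataclass
class Video:
    video_id: str
    owner_id: str     # refers to User.user_id
    storage_url: str  # location of the video file in blob storage
```

Writing the model this way makes the relationships explicit: `author_id`, `sender_id`, `receiver_id`, and `owner_id` all point back to a `User`, which is usually the first thing an interviewer will ask about when discussing sharding or indexing.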

Design choice	Trade-off
SQL vs NoSQL	Consistency vs scalability
Push vs Pull	Real-time updates vs simplicity
Cache-heavy design	Performance vs complexity
Strong consistency	Correctness vs latency

Phase	Time
Requirements	5 minutes
Estimation	5 minutes
High-level design	10 minutes
Deep dive	15 minutes
Trade-offs and improvements	10 minutes
