Note: This post was originally published in 2020 and has been updated as of Nov. 15, 2021.
Understanding system design is more important than ever in today’s tech industry. The larger an application gets, the more reliable and efficient architectural patterns need to be. System design is likely a requirement to land your dream software engineering job.
Today, we’ll apply system design principles by approaching two common system design interview questions: designing TinyURL and Instagram. You’re in the right place if you’re here to prepare for a system design interview.
Our comprehensive system design course covers these design problems – and much more.
TinyURL is a URL shortening web service that creates shorter aliases for long URLs. Users select these shortened URLs and are redirected to the original URL. This service is useful because short links save space and allow users to easily type abbreviated URLs instead of long ones.
Let’s consider some functional and non-functional requirements for designing TinyURL.
Non-functional requirements:
The system must be highly available. The short links will not function if the service fails.
URL redirection should happen in real-time with minimal latency.
Shortened links should not be predictable in any manner.
Functional requirements:
Our service will generate a shorter alias of the original URL that can easily be copied and pasted.
The short link should redirect users to the original link.
Users should have the option to pick a custom short link for their URL.
Short links will expire after a default timespan, which users can specify.
Our system will be read-heavy. There will be a large number of redirection requests compared to new URL shortenings. Let’s assume a 100:1 read/write ratio. Our reads are redirection requests and writes are new URL shortenings.
Traffic estimates: Let’s assume we have 500 million new URL shortenings per month. Based on our 100:1 read/write ratio, we can expect 50 billion redirections during the same period:
We’ll now determine our system’s Queries Per Second (QPS). We’ll take the monthly amount of 50 billion to calculate the new URLs shortenings per second:
Applying the 100:1 read/write ratio again, URL redirections per second will be:
Storage estimates: Let’s assume we store every URL shortening request and its associated link for five years. Since we expect to have 500 million new URLs every month, the total number of objects we expect to store over five years will be 30 billion:
For now, let’s assume that each stored object will be approximately 500 bytes. This means we’ll need 15TB of total storage for the five year period:
Bandwidth estimates: We expect 200 new URLs per second for write requests. This makes our service’s total incoming data 100KB per second:
For read requests, we expect approximately 20,000 URL redirections every second. Then total outgoing data for our service would be 10MB per second:
Memory estimates: We’ll need to determine how much memory we’ll need to store the frequently accessed hot URLs’ cache. If we follow the 80-20 rule, then 20% of URLs will generate 80% of traffic. We would like to cache this 20% of hot URLs.
Since we have 20,000 requests per second, we will be getting 1.7 billion requests per day:
To cache 20% of these requests, we will need 170GB of memory.
It’s worth noting that there will be a lot of duplicate requests for the same URL. This means our actual memory usage will be less than 170GB.
We can use REST APIs to expose our service’s functionality. This example shows the API’s definition for creating and deleting URLs without service:
createURL(api_dev_key, original_url, custom_alias=None, user_name=None, expire_date=None)
Parameters:
api_dev_key
(string): A registered account’s API developer key. This will be used to throttle users based on their allocated quota.
original_url
(string): Original URL to be shortened.
custom_alias
(string): Optional custom key for the URL.
user_name
(string): Optional user name to be used in the encoding.
expire_date
(string): Optional expiration date for the shortened URL.
Returns (string): A successful insertion returns the shortened URL. Otherwise, it returns an error code.
deleteURL(api_dev_key, url_key)
The url_key
is a string that delineates the shortened URL that needs to be deleted. A successful deletion will return URL Removed
.
In real TinyURL and Instagram system design interviews, you’ll be asked how to prevent spam and malicious use. For TinyURL, add:
Key generation trade-offs
Our comprehensive system design course covers these design problems – and much more.
Let’s look at some observations about the data we are storing:
The service needs to store billions of records.
The service is read-heavy.
Each object is small (<1,000).
There are no relationships between each record, except for storing which user created the short link.
Schema:
We’ll now consider database schemas. We need a table for storing information on URL mappings as well as a database for data on users who created short links.
What type of database do we use?
The best choice would be to use a NoSQL database store like DynamoDB or Cassandra since we are storing billions of rows with no relationships between the objects.
Our service’s most crucial concern is how we’ll generate a short and unique key when given a URL. For this example, we’ll go about this by encoding the URL.
We can compute a given URL’s unique hash, such as MD5 or SHA256. The hash can then be encoded for display. This encoding could be base36, base62, or base64.
We’ll use base64 encoding. Now we’ll consider whether the short key should be six, eight, or 10 characters long.
Using base64 encoding, a six-character key would yield 64^6 = ~68.7 billion possible strings.
Using base64 encoding, an eight-character key would result in 64^8 = ~281 trillion possible strings
Let’s assume the six-character key is sufficient for our system. Our hash function will produce a 128-bit hash value if we use the MD5 algorithm. Each base64 character encodes six bits of the hash value. This means we’ll get a string of over 21 characters after encoding. We’ll need to take a different approach for choosing our key because we only have space for 8 characters per short key.
We can take the first six (or eight) letters for the key. We’ll need to take steps to avoid key duplication. We might swap some characters or choose characters not already in the encoding string.
There are some potential obstacles when taking this approach:
If multiple users enter the same URL, they will get the same short link.
Parts of the URL can be URL-encoded.
Workarounds: One solution to the aforementioned problems would be appending a number from a sequence to each short link URL. This would make each link unique, even if multiple users provide the same URL. Another potential solution is to append the user ID to the input URL. However, this would only work if the user was signed in. We would also need to generate a uniqueness key.
Learn how to design generative AI systems!
This course will prepare you to design generative AI systems with a practical and structured approach. You will begin by exploring the foundational concepts, such as neural networks, transformers, tokenization, embedding, etc. This course introduces a 6-step SCALED framework, a systematic approach to designing robust GenAI systems. Next, through real-world case studies, you will immerse into the design of GenAI systems like text-to-text (e.g., ChatGPT), text-to-image (e.g., Stable Diffusion), text-to-speech (e.g., ElevenLabs), and text-to-video (e.g., SORA). This course describes these systems from a user-focused perspective, emphasizing how user inputs interact with backend processes. Whether you are an ML/software engineer, AI enthusiast, or manager, this course will equip you to design, train, and deploy generative AI models for various use cases. You will gain confidence to approach new challenges in GenAI and leverage advanced techniques to create impactful solutions.
Our database will store information about billions of URLs. We’ll need to partition it to make it scalable.
A
in one partition, and so on.This approach is problematic because it leads to unbalanced database servers. This in turn will create unequal load.
This approach sometimes leads to overloaded partitions. If it does, you could use consistent hashing to resolve it.
Our service should be able to cache URLs that are frequently accessed. We could use a solution like Memcached to store full URLs with respective hashes.
Cache memory requirements: We can start with around 20% of the daily traffic and adjust based on usage patterns. Our previous estimations tell us we’ll need 170GB of memory to cache 20% of the daily traffic.
Cache eviction policy: We want to replace a link with a more popular URL. We can use the Least Recently Used (LRU) policy for our system.
We can add a load balancing layer to our system in three places:
Between clients and application servers
Between application servers and database servers
Between application servers and cache servers
We can start with a Round Robin approach to equally distribute incoming requests among servers. This is easy to implement and does not create any overhead.
It’s important to note that the load balancer won’t stop sending new requests to servers that become overloaded when using this approach. To avoid overload, a more intelligent load balancer solution can be developed to periodically adjust traffic based on load per server.
Instagram is a social media platform that allows users to share photos and videos with other users. Instagram users can share information either publicly or privately.
We’ll design a simple version of Instagram that enables users to share photos, follow each other, and access the news feed. Users will see the top photos of the people they follow on their news feed.
Let’s consider some functional and non-functional requirements for designing Instagram.
Non-functional requirements:
The service should be highly available.
The system’s acceptable latency should be around 200 milliseconds for the news feed.
The system should so reliable that photos and videos in the system are never lost.
Functional requirements:
The user should be able to search based on photo or video titles.
The user should be able to upload, download, and view photos and videos.
The user should be able to follow other users.
The system should be able to generate a displayed news feed consisting of the top photos and videos of the people that the user follows.
Functions that are not within the scope of this project are adding tags to photos, searching for photos with tags, commenting on photos, and tagging users.
Our comprehensive system design course covers these design problems – and much more.
Grokking Modern System Design for Software Engineers & Managers: Design Instagram
We will base our approach off these numbers:
500 million total users
1 million daily active users
2 million new photos every day (at a rate of 23 new photos/second)
Average photo file size of 200KB
Total space required for 1 day of photos:
The system should be able to support users as they upload and view each other’s media. This means our service needs servers to store media as well as another database server to store the media’s metadata.
In the TinyURL and Instagram system design question, the feed is the crux:
Fanout-on-write: when Alice posts, enqueue to all followers’ inboxes (Timeline Store). Read is fast, write scales poorly for celebrities.
Fanout-on-read: compute on demand at read time (pull recent posts from followed users, merge/sort). Writing is cheap, reading is expensive.
Hybrid: pre-fan out to low/medium-degree followers; fetch on read for high-degree fans of celebrities. This reduces write storms.
Use a wide-column store (e.g., Cassandra) keyed by UserID with columns = post IDs sorted by TimeUUID. Keep two timelines:
Home (inbox): precomputed fanouts (best for fast scroll).
Unified: for backfill/edge cases (celeb posts, late arrivals).
A ranker (lightweight ML or heuristics) scores posts using recency, interactions, relationship strength, content type, and freshness. Serve top-K plus a tail of recent items. Keep feature store counters (likes, comments) eventually consistent.
We could take a straightforward approach by storing the above schema in a Relational Database Management System (RDBMS). However, challenges often arise when using a relational database for a scaling application.
We can store the above schema with key-value pairs utilizing a NoSQL database. The metadata for the photos and videos will belong to a table where the key
would be the PhotoID
, and the value
would be an object containing PhotoLocation
, UserLocation
, CreationTimestamp
, and so on.
We can use a wide-column datastore, Cassandra, to store the relationships between users and photos as well as a list of a user’s followed accounts. The UserPhoto
table’s key
would be UserID
. The value
would be the list of PhotoIDs
the user owns. These will be stored in different columns. This pattern will be similar to the UserFollow
table.
The photos and videos can be stored in a distributed file storage like HDFS.
Let’s estimate how much data will go into each table and how much total storage we’ll need for 10 years.
User: Each row in this table will be 68 bytes, assuming each int
and dateTime
is four bytes:
If we have 500 million users, we’ll need 32GB of total storage:
Photo: Each row in this table will be 284 bytes:
If 2 million new photos get uploaded every day, we’ll need 0.5GB of storage for one day:
For 10 years we will need 1.88TB of storage.
UserFollow: Each row in this table will be eight bytes. We have 500 million users. If each user follows an average of 500 other users, we’ll need 1.82TB of storage for the UserFollow table:
Total space required for all tables for 10 years will be 3.7TB:
Photo uploads are often slower than reads because they go to the disk. Uploading users will consume all the available connections because of how slow the process is. Reads cannot be served when the system is loaded with write requests. To handle this bottleneck, we can split reads and writes to separate servers so that the system is not overloaded.
This will allow us to optimize each operation efficiently.
Creating redundancy in the system allows us to create a backup in the midst of a system failure. We can’t lose any files and need a highly reliable application.
We’ll achieve this by storing multiple copies of each photo and video so the system can retrieve media from a copy server in the even that a server dies. We’ll apply this design to the other components of our architecture. With multiple copies of services in the system, the system will run even if a service dies.
One possible scheme for a metadata sharding service is to partition based on PhotoID.
To solve the above problems, we can generate unique PhotoIDs and then find a shard number through PhotoID % 10
. We would not need to append ShardID with PhotoID in this case because PhotoID will be unique throughout the system.
We won’t be able to define PhotoID by having an auto-incrementing sequence in each shard. Wee need to know PhotoID in order to find the shard where it will be stored. One solution could be to dedicate a separate database instance to generate auto-incrementing IDs.
If our PhotoID can fit into 64 bits, we can define a table containing only a 64 bit ID field. Whenever we want to add a photo in our system, we can insert a new row in this table and take that ID as the PhotoID of our new photo.
This key-generating database could be a single point of failure. A workaround for that could be defining two such databases with one generating even-numbered IDs and the other odd-numbered. For MySQL, the following script can define such sequences:
KeyGeneratingServer1:auto-increment-increment = 2auto-increment-offset = 1KeyGeneratingServer2:auto-increment-increment = 2auto-increment-offset = 2
We can put a load balancer in front of both of these databases. We can Round Robin between them to deal with downtime. One of these servers could generate more keys than the other. Luckily, if they are out of sync in this way, it won’t cause an issue for our system. We can extend this design by defining separate ID tables for Users, Photo-Comments, or other objects present in our system.
Indexing: push metadata (caption/title/hashtags) into Elasticsearch/OpenSearch with analyzers and time-based indices.
Secondary indexes: user → posts, hashtag → posts using wide-column stores; maintain via CDC (change data capture) or write-side dual-writes + idempotent workers.
Celebrity posts can melt caches:
Split counters by shard keys (e.g., 128 buckets per post) and sum on read.
Pre-warm CDN edges for trending posts; keep write-light/read-heavy endpoints cacheable.
Use write-through cache for timelines and read-through with stale-while-revalidate for profiles.
Eventual consistency for likes/views counters and follower counts.
Stronger consistency for actions that gate UX (e.g., follow/unfollow acknowledgment), achieved with single-home writes or lightweight transactions.
The service would need a large-scale photo delivery system to serve data to users globally. We could push the content closer to users using cache servers that are geographically distributed.
Well done! You should now have a good idea of how to design an application like Instagram and TinyURL. There are still many more system design patterns to learn.
To help you master system design for your interview and beyond, we’ve created Grokking Modern System Design for Software Engineers & Managers. This course dives deep into other design problems, such as Facebook Messenger, Twitter, and more. But going beyond your interview, it’s a comprehensive course covering the fundamentals system design for developers and engineering managers alike.
Happy learning!