
Non-Functional Requirements for System Design Interviews

Learn why non-functional requirements matter in System Design interviews, and discover a few often-overlooked best practices for designing systems that scale well.

Designing systems to meet non-functional requirements (NFRs) is challenging. You must manage trade-offs among competing goals such as scalability, availability, performance, and security.

Non-functional requirements

Consider this common System Design interview question:

  • How can you design a scalable and performant e-commerce website that can handle millions of requests per second?

Engineers often meet functional requirements easily but struggle to achieve scalability and low latency simultaneously. This lesson covers essential strategies for meeting NFRs in your designs.

Note: This lesson focuses on how to achieve NFRs in a design, not on how to gather or define them.

Common non-functional requirements

Interviewers focus on specific NFRs. We will address:

  1. Performance

  2. Availability

  3. Scalability

1) Performance

Performance measures a system’s ability to respond to requests and process data efficiently. For example, in a messaging service, an interviewer might ask: How do you deliver messages with low latency (i.e., minimal delay)? To achieve this, you might select an efficient two-way protocol like WebSocket.

Approaches to achieve performance

Caching: Caching stores frequently accessed data, reducing repeated computations and user-perceived latency.

Web service uses service cache to access frequently accessed data to ensure low latency

Consider an X (formerly Twitter)-like system with a service dedicated to generating the timeline: a stream of posts and recommendations based on the user's interests and followers' activity (such as their posts, reposts, and likes).

Does the service generate a timeline for every follower when a celebrity posts? With millions of followers, this would severely degrade performance.

To address this, first divide followers into active users (those who use their X accounts frequently) and inactive users (those who last used their accounts a long time ago, say, more than three months ago). Generate timelines for inactive users on demand. For active users, introduce a feed cache: a distributed cache, such as Redis, that stores their timelines. This cache prepopulates the timeline. When active users request their feed, the service retrieves it immediately from the cache, ensuring minimal latency.

Fetching a timeline for active users from the feed cache

Caching at multiple system layers also ensures decoupling and low latency.
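The active/inactive split above amounts to a cache-aside lookup. The sketch below illustrates the idea; `TimelineService`, `feed_cache`, and `generate_timeline` are illustrative names (a real deployment would use a distributed cache such as Redis rather than an in-process dictionary):

```python
class TimelineService:
    """Cache-aside timeline lookup: serve active users from a feed cache,
    generate for inactive users on demand (illustrative sketch only)."""

    def __init__(self):
        self.feed_cache = {}  # stands in for a distributed cache like Redis

    def generate_timeline(self, user_id):
        # Placeholder for the expensive fan-in/ranking work.
        return [f"post-for-{user_id}"]

    def get_timeline(self, user_id, is_active):
        if is_active and user_id in self.feed_cache:
            return self.feed_cache[user_id]          # cache hit: minimal latency
        timeline = self.generate_timeline(user_id)   # on-demand generation
        if is_active:
            self.feed_cache[user_id] = timeline      # prepopulate for next request
        return timeline
```

Only active users occupy cache space, so the celebrity-post fan-out work is bounded by the active-follower count rather than the total follower count.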

Algorithm/Data structure selection: Efficient algorithms minimize processing time. For example, consider a ride-hailing system that updates driver positions every four seconds. You must choose a data structure that handles frequent spatial updates efficiently.

A Quadtree is a strong candidate for spatial indexing. However, updating a Quadtree every four seconds introduces computational overhead that increases latency. You must evaluate whether a Quadtree is optimal here or whether a hybrid approach better balances performance and scalability.
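One simpler alternative worth weighing against a Quadtree is a coarse uniform grid: updating a driver's cell is O(1), and most four-second updates don't even change cells, while nearby-driver queries scan a small neighborhood of cells. This is a hedged sketch of that trade-off, not a production spatial index:

```python
class GridIndex:
    """Uniform grid index: cheap writes for frequent position updates,
    coarse reads by scanning neighboring cells (illustrative sketch)."""

    def __init__(self, cell_deg=0.01):  # roughly 1 km cells near the equator
        self.cell_deg = cell_deg
        self.cells = {}      # (cx, cy) -> set of driver ids
        self.position = {}   # driver id -> current cell

    def _cell(self, lat, lon):
        return (int(lat / self.cell_deg), int(lon / self.cell_deg))

    def update(self, driver_id, lat, lon):
        new_cell = self._cell(lat, lon)
        old_cell = self.position.get(driver_id)
        if old_cell == new_cell:
            return                           # most updates stay in the same cell
        if old_cell is not None:
            self.cells[old_cell].discard(driver_id)
        self.cells.setdefault(new_cell, set()).add(driver_id)
        self.position[driver_id] = new_cell

    def nearby(self, lat, lon):
        cx, cy = self._cell(lat, lon)
        found = set()
        for dx in (-1, 0, 1):                # scan the 3x3 cell neighborhood
            for dy in (-1, 0, 1):
                found |= self.cells.get((cx + dx, cy + dy), set())
        return found
```

A Quadtree gives finer-grained queries in dense areas; the grid wins on write throughput. A hybrid (grid for hot writes, Quadtree rebuilt periodically for reads) is one way to balance the two.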

Load balancing: Distributing traffic evenly across servers prevents bottlenecks. For an e-commerce site handling millions of concurrent requests, load balancers ensure no single server is overwhelmed.

Distributing user requests across multiple web servers
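The simplest distribution policy a load balancer can apply is round-robin, rotating through the server pool so each server gets an equal share. A minimal sketch (server names are hypothetical):

```python
import itertools

class RoundRobinBalancer:
    """Minimal round-robin request distribution across a fixed server pool."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def route(self, request):
        server = next(self._cycle)   # pick the next server in rotation
        return server, request

balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
assigned = [balancer.route(f"req-{i}")[0] for i in range(6)]
# Six requests spread evenly: two per server.
```

Real balancers add health checks and weighted or least-connections policies on top of this basic rotation.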

2) Availability

Availability measures system uptime and accessibility. 99.999% uptime (less than 6 minutes of downtime per year) is the gold standard but difficult to achieve. High availability is critical for retention; for example, downtime on an e-commerce site directly loses sales.
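The downtime budget behind each "nines" figure is simple arithmetic over the minutes in a year; the quick calculation below shows that five nines leaves roughly 5.3 minutes of downtime per year:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year

for availability in (0.99, 0.999, 0.9999, 0.99999):
    downtime = MINUTES_PER_YEAR * (1 - availability)
    print(f"{availability:.5%} uptime -> about {downtime:.1f} min/year of downtime")
# Five nines (99.999%) allows roughly 5.3 minutes of downtime per year.
```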

Approaches to achieve availability

Redundancy: Replicate key components and data across multiple servers and data centers. If one server fails, a load balancer reroutes requests to a backup, eliminating single points of failure.

Replicating key components to eliminate single point of failure

Fault tolerance: Systems must function even when components fail. For instance, if a database node fails during a sale, the system should automatically switch to a backup node using failover mechanisms.

Rate limiting: Rate limiters restrict the number of requests a service handles to prevent overload. On social media, this prevents sudden spikes in activity (e.g., likes, follows) from crashing the system.

Rate limiting to prevent web server overload
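One standard way to implement such a limiter is a token bucket: tokens refill at a steady rate up to a cap, and each request spends one token. This is a minimal single-node sketch (a distributed limiter would keep the bucket state in a shared store):

```python
class TokenBucket:
    """Token-bucket rate limiter: refills at `rate` tokens/sec up to `capacity`;
    a request is allowed only if a token is available."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = 0.0

    def allow(self, now):
        # Refill proportionally to elapsed time, then spend one token if possible.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

The `capacity` absorbs short bursts (a flurry of likes), while `rate` caps sustained throughput; excess requests are rejected or queued instead of overwhelming the servers.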

CDNs: Content delivery networks (CDNs) distribute cache servers geographically. They improve availability by reducing load on origin servers and mitigating regional outages. They also reduce latency by serving content from locations closer to the user.

Stress testing and monitoring: Stress testing identifies breaking points under peak loads. Monitoring tracks performance in real-time, allowing you to detect anomalies before they cause downtime.

3) Scalability

Scalability is the ability to handle growing numbers of users while maintaining performance. Interviewers may ask you to design a video platform like YouTube or a URL shortener handling billions of queries.

Approaches to achieve scalability

Manual scaling: You can upgrade hardware (vertical) or add machines (horizontal):

  • Vertical scaling (hardware upgrades): Adding resources (RAM, CPU) to existing machines. It is simple to manage but has a hard limit.

  • Horizontal scaling (adding machines): Adding more machines to distribute workload. This is preferred for large-scale applications as it supports load balancing and eliminates single points of failure.

Vertical vs. horizontal scaling

Automatic scaling: Dynamically adjusts resources based on traffic spikes using cloud techniques like auto scaling.

Sharding: Splits a database into smaller shards to distribute the data load across servers. Common techniques include key-range sharding (distributing data based on specific ranges of keys) and hash-based sharding (applying a hash function to the keys to ensure an even distribution across shards).
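Hash-based shard selection can be sketched in a few lines. Note the use of a stable hash (here MD5, chosen for illustration) rather than Python's built-in `hash`, which is salted per process and would route the same key differently on different servers:

```python
import hashlib

def shard_for(key, num_shards):
    """Map a key to a shard id using a stable hash over the key bytes."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Every node computes the same shard for the same key.
shard = shard_for("user:42", 4)
```

One caveat worth raising in an interview: with plain modulo, changing `num_shards` remaps most keys, so production systems often use consistent hashing to limit data movement during resharding.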

Modular design: Decomposes the system into independent services. Each service scales independently based on demand.

Monolithic vs. modular designs

Caches and CDNs: Caching reduces database load, while CDNs offload static content delivery from origin servers. Together, they allow the system to handle high request volumes efficiently.

Let's apply these concepts to Google Maps and YouTube.

Acing NFRs: Google Maps and YouTube

Let’s explore non-functional requirements for Google Maps and YouTube System Design problems.

Design Google Maps

A navigation system must identify locations, find optimal routes, and provide turn-by-turn directions.

Strategies to meet Google Maps NFRs:

  • High availability: The road network graph is too large for a single server. To ensure availability, split the graph into segments hosted on separate, replicated servers. A load balancer distributes requests across these segment servers to eliminate single points of failure.

  • Scalability: A distributed architecture allows segment servers to handle requests for specific routes independently. This modular design scales easily by adding segments for new data.

Nonfunctional requirements and strategies:

  • Availability:

    • Divide the road network graph into small graphs (segments) to process user queries.
    • Replicate the segment servers.
    • Load balance requests across the different segment servers.

  • Scalability:

    • Partition the large graph into smaller graphs to ease segment addition.
    • Host the graphs on different servers to handle an increased number of queries per second.

Design YouTube

A video streaming platform enables users to upload, search, stream, and rate videos.

Strategies to meet YouTube NFRs:

  • Minimal response times: Use caching servers at ISP and CDN levels to deliver popular content quickly. Optimize storage by using Bigtable for thumbnails and Blob storage for videos. A lightweight web server (e.g., Lighttpd) efficiently handles video uploads.

  • Reliability: Use data sharding to isolate failures. Replicate critical components for fault tolerance and use heartbeat messages (regularly spaced messages each node sends to signal it is healthy and active; if a node stops sending heartbeats, other nodes can assume it has failed) to detect and remove faulty servers.
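The heartbeat-based failure detection described above can be sketched as a timeout check over each node's last-seen heartbeat (node names and the timeout value are illustrative):

```python
class FailureDetector:
    """Marks a node failed if no heartbeat arrives within `timeout` seconds."""

    def __init__(self, timeout=10.0):
        self.timeout = timeout
        self.last_seen = {}  # node id -> time of its most recent heartbeat

    def heartbeat(self, node, now):
        self.last_seen[node] = now

    def failed_nodes(self, now):
        # Any node silent for longer than the timeout is presumed failed.
        return [n for n, t in self.last_seen.items() if now - t > self.timeout]
```

Once a node appears in `failed_nodes`, the system can route around it and promote a replica, which is exactly the failover behavior the reliability requirement calls for.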

Nonfunctional requirements and strategies:

  • Low response time:

    • Cache at different layers.
    • Use CDNs.
    • Choose appropriate storage systems (e.g., Blob storage for videos, Bigtable for thumbnails).
    • Serve videos and static content with Lighttpd.

  • Reliability:

    • Shard data.
    • Replicate critical components.
    • Use a heartbeat protocol.

Quick tips for NFR interview questions

  • Proactively clarify NFRs during the interview. Ask about:

    • Expected user traffic

    • Expected data load

    • Expected downtime tolerance

  • Evaluate trade-offs between techniques, considering complexity, cost, and maintainability.

  • Prepare solutions for common patterns:

    • Transactions: Choose ACID-compliant relational databases.

    • Large-scale data: Use NoSQL databases (MongoDB, Cassandra) for scalability.

    • Real-time data: Use streaming platforms like Apache Kafka or Amazon Kinesis.

There is no one-size-fits-all solution. Success depends on asking clarifying questions, prioritizing NFRs, and justifying your trade-offs.

Conclusion

This lesson covered the role of non-functional requirements in System Design. Understanding common NFRs and how to address them helps you answer System Design interview questions more effectively.