Scalability
Discover how to achieve system scalability to handle increasing workloads without performance degradation. Compare vertical and horizontal scaling strategies, and apply techniques such as load balancing, caching, and sharding. Learn to make informed System Design decisions that balance cost, complexity, and growth.
What is scalability?
Scalability is a system’s ability to handle increasing workload without degrading latency, throughput, or reliability. For example, a search engine must support more concurrent users and larger indexes while keeping query latency low. A scalable system can increase capacity to meet demand without a noticeable drop in responsiveness or availability.
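To make the definition concrete, here is a minimal sketch of how latency behaves as workload grows. It assumes a deliberately simplified model that is not part of the lesson itself: each server is treated as an independent M/M/1 queue, whose mean response time is 1 / (service rate − arrival rate), and traffic is split evenly across servers. The function name `mean_response_time` and the `SERVICE_RATE` value are hypothetical and chosen only for illustration.

```python
def mean_response_time(arrival_rate, service_rate, servers=1):
    """Mean response time in seconds under a simplified model:
    load is split evenly across `servers` independent M/M/1 queues.
    Returns infinity when a server receives more work than it can process."""
    per_server_load = arrival_rate / servers
    if per_server_load >= service_rate:
        return float("inf")  # overloaded: the queue grows without bound
    return 1.0 / (service_rate - per_server_load)


# Hypothetical capacity: requests per second that one server can process.
SERVICE_RATE = 100.0

if __name__ == "__main__":
    for load in (50, 90, 99, 150):
        fixed = mean_response_time(load, SERVICE_RATE, servers=1)
        scaled = mean_response_time(load, SERVICE_RATE, servers=2)
        print(
            f"{load:>4} req/s   "
            f"1 server: {fixed * 1000:8.1f} ms   "
            f"2 servers: {scaled * 1000:8.1f} ms"
        )
```

Under these assumptions, the single-server latency climbs steeply as load approaches capacity and becomes unbounded beyond it, while adding a second server keeps latency low at the same load. Real systems behave less cleanly, but the shape of the curve is the point: a scalable system is one that can add capacity before it reaches the steep part.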
Early Twitter shows what happens when a system lacks scalability: the site often went down and displayed the “fail whale” error page. This was primarily a scalability issue, because the system could not keep up with rapid user growth and write-heavy traffic. Scalability allows a system to absorb traffic spikes without excessive latency or downtime. Without it, response times increase and outages become more likely.