Scaling Strategies

Explore how vertical and horizontal scaling strategies address growth and load in system design. Understand the trade-offs in capacity, complexity, and resilience. Learn when to use each approach, challenges in distributed scaling, and how auto-scaling optimizes resources dynamically. This lesson prepares you to design scalable, reliable systems that meet performance and availability demands.

The real-time communication patterns from the previous lesson, in which SSE streams and WebSocket connections hold persistent links between client and server, are exactly where the choice of scaling strategy becomes critical.

Scaling strategies are the architectural response to growth pressure, and they break down into two fundamental approaches: vertical scaling (scaling up a single machine) and horizontal scaling (scaling out across multiple machines). These two approaches form the core framework for every capacity decision explored in this lesson.

Vertical vs. horizontal scaling

Vertical scaling adds more CPU, RAM, or storage to a single machine. A database server running low on memory gets upgraded from 64 GB to 256 GB, and the application continues without code changes or distributed coordination overhead. This simplicity is its greatest advantage. However, vertical scaling hits a hard ceiling set by the largest hardware available, and because all traffic flows through one machine, that machine is a single point of failure. If it goes down, the entire service goes with it.

Adding CPU, RAM to the server rack
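The hard ceiling on vertical scaling can be illustrated with a small capacity check. This is a minimal sketch, not a sizing tool: the function name `plan_memory_upgrade` and the 1024 GB ceiling are assumptions chosen for illustration, and the real limit depends on the hardware vendor or cloud provider.

```python
import math

# Hypothetical ceiling: the largest memory size offered for a single machine.
# The real figure depends on the hardware vendor or cloud provider.
MAX_SINGLE_MACHINE_GB = 1024


def plan_memory_upgrade(current_gb: int, required_gb: int,
                        ceiling_gb: int = MAX_SINGLE_MACHINE_GB) -> str:
    """Report whether a memory requirement still fits on one machine."""
    if required_gb <= current_gb:
        return "no change needed"
    if required_gb <= ceiling_gb:
        # Vertical scaling: same machine, bigger resources, no code changes.
        return f"scale up: {current_gb} GB -> {required_gb} GB on the same machine"
    # Past the ceiling, one machine cannot hold the workload any more.
    machines = math.ceil(required_gb / ceiling_gb)
    return f"ceiling reached: need at least {machines} machines of {ceiling_gb} GB"


print(plan_memory_upgrade(64, 256))
print(plan_memory_upgrade(64, 4096))
```

The first call mirrors the 64 GB to 256 GB upgrade described above and stays on one machine. The second exceeds the assumed ceiling, so no single-machine upgrade can satisfy it. The sketch covers capacity only; it does not address the single point of failure, which remains no matter how large the machine is.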

...