An Introduction to Application Scaling

Get familiar with the basics of application scaling in Node.js.

Scalability can be described as the capability of a system to grow and adapt to ever-changing conditions. Scalability isn’t limited to pure technical growth; it also depends on the growth of the business and the organization behind it.

If we expect our product to reach millions of users worldwide rapidly, we’ll face serious scalability challenges. How is our application going to sustain the ever-increasing demand? Is the system going to get slower over time or crash often? How can we store high volumes of data and keep I/O under control? As more people are hired, how can we organize the different teams effectively and enable them to work autonomously, without contention across the different parts of the codebase?

Even if we’re not working on a high-scale project, that doesn’t mean that we’ll be free from scalability concerns. We’ll just face different types of scalability challenges. Being unprepared for these challenges might seriously hinder the success of the project and ultimately damage the company behind it. It’s important to approach scalability in the context of the specific project and understand the expectations for current and future business needs.

Because scalability is such a broad topic, we’ll focus our attention on the role of Node.js in this context. We’ll discuss several useful patterns and architectures for scaling Node.js applications.

With these patterns and architectures in our toolbelt and a solid understanding of our business context, we’ll be able to design and implement Node.js applications that can adapt and satisfy our business needs and keep our customers happy.

Scaling Node.js applications

We already know that most of the workload of a typical Node.js application runs in the context of a single thread. This isn’t necessarily a limitation but rather an advantage, because it allows the application to optimize the usage of the resources necessary to handle concurrent requests, thanks to the non-blocking I/O paradigm. This model works wonderfully for applications handling a moderate number of requests per second (usually a few hundred), especially if the application is mostly performing I/O-bound tasks (for example, reading and writing from the filesystem and the network) rather than CPU-bound ones (for example, number crunching and data processing).
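
To make this concrete, here’s a minimal sketch of an I/O-bound HTTP server (the data.json file name and port 8080 are arbitrary placeholders). A single thread handles every request, but because the file read is non-blocking, the event loop is free to accept new connections while previous reads are still in flight:

    import { createServer } from 'node:http'
    import { readFile } from 'node:fs/promises'

    const server = createServer(async (req, res) => {
      try {
        // The read is non-blocking: while it is in flight, the event loop
        // keeps accepting and serving other requests on the same thread.
        const data = await readFile('data.json')
        res.writeHead(200, { 'Content-Type': 'application/json' })
        res.end(data)
      } catch (err) {
        res.writeHead(500)
        res.end('Internal server error')
      }
    })

    server.listen(8080, () => console.log('Listening on port 8080'))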

In any case, assuming we’re using commodity hardware, the capacity that a single thread can support is limited, regardless of how powerful the server is. If we want to use Node.js for high-load applications, the only way is to scale it across multiple processes and machines.
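
As a sketch of what scaling across multiple processes can look like on a single machine, the example below uses the built-in cluster module to fork one worker per CPU core; the primary process distributes incoming connections among the workers (the port number is arbitrary):

    import cluster from 'node:cluster'
    import { createServer } from 'node:http'
    import { cpus } from 'node:os'

    if (cluster.isPrimary) {
      // Fork one worker per available CPU core
      for (let i = 0; i < cpus().length; i++) {
        cluster.fork()
      }
      // Replacing crashed workers also improves availability
      cluster.on('exit', () => cluster.fork())
    } else {
      // Every worker runs the same server; connections are shared among them
      createServer((req, res) => {
        res.end(`Handled by worker ${process.pid}\n`)
      }).listen(8080)
    }

This is an example of the cloning approach we’ll revisit when discussing the scale cube below.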

However, workload isn’t the only reason to scale a Node.js application. In fact, with the same techniques that allow us to scale workloads, we can obtain other desirable properties such as high availability and tolerance to failures. Scalability is also a concept applicable to the size and complexity of an application. In fact, building architectures that can grow as much as needed over time is another important factor when designing software.

JavaScript is a tool to be used with caution. The lack of type checking and its many gotchas can be an obstacle to the growth of an application, but with discipline and careful design, we can turn some of its downsides into precious advantages. With JavaScript, we’re often pushed to keep the application simple and to split its components into small, manageable pieces. This mindset can make it easier to build applications that are distributed and scalable, and also easy to evolve over time.

The three dimensions of scalability

When talking about scalability, the first fundamental principle to understand is load distribution, which is the science of splitting the load of an application across several processes and machines. There are many ways to achieve this, and the book The Art of Scalability by Martin L. Abbott and Michael T. Fisher proposes an ingenious model to represent them, called the scale cube. This model describes scalability in terms of the following three dimensions:

  • X-axis—Cloning

  • Y-axis—Decomposing by service/functionality

  • Z-axis—Splitting by data partition

These three dimensions can be represented as a cube, as shown in the illustration below.
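
For example, the X-axis corresponds to running multiple identical clones of the application, as in the cluster example shown earlier, while the Z-axis assigns each instance responsibility for only a portion of the data. The sketch below hints at the latter: it hashes a key (for example, a user ID) to pick which of a set of hypothetical instances should handle it:

    import { createHash } from 'node:crypto'

    // Hypothetical application instances, each owning one data partition
    const instances = [
      'http://app-0.internal:8080',
      'http://app-1.internal:8080',
      'http://app-2.internal:8080'
    ]

    function instanceFor (key) {
      // Hash the key and map it to one of the partitions, so that requests
      // for the same key always reach the same instance
      const hash = createHash('sha256').update(key).digest()
      return instances[hash.readUInt32BE(0) % instances.length]
    }

    console.log(instanceFor('user-42')) // always routes 'user-42' to the same partition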
