Introduction to Scaling

Get an introduction to scaling.

What is scaling?

When we talk about scaling Python, what we mean is making a Python application scalable. However, what is scalability?

According to Wikipedia:

“Scalability is the capability of a system, network, or process to handle a growing amount of work, or its potential to be enlarged to accommodate that growth.” - Wikipedia

This definition shows why scalability is difficult to define in absolute terms: no single definition applies to all applications.

This course concentrates on the methods, technologies, and practices that make applications fast and able to grow to handle more jobs, all using the Python programming language and its main implementation, CPython.

Scaling

Distributed applications

We are all aware that processors are not getting faster at a rate that would, one day, let a single-threaded application handle a workload of any size. That means we need to think about using more than one processor. Building scalable applications implies distributing the workload across multiple workers running on multiple processing units.

Those workers divide up tasks across several processors, and in some cases, across several computers.

That is a distributed application.
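
As a rough illustration of workers dividing up tasks across processors, the sketch below splits a list of numbers into chunks and hands each chunk to a separate worker process using the standard library's `concurrent.futures.ProcessPoolExecutor`. The function names (`sum_of_squares`, `split`, `distributed_sum_of_squares`) and the workload itself are assumptions made up for this example, not something the course prescribes.

```python
from concurrent.futures import ProcessPoolExecutor


def sum_of_squares(chunk):
    """CPU-bound job handled by a single worker (hypothetical workload)."""
    return sum(n * n for n in chunk)


def split(items, parts):
    """Split `items` into `parts` contiguous chunks of roughly equal size."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks each take one leftover item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def distributed_sum_of_squares(numbers, workers=4):
    """Divide the workload across `workers` processes and combine the results."""
    chunks = split(list(numbers), workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # Each chunk is processed in its own worker process.
        return sum(pool.map(sum_of_squares, chunks))


if __name__ == "__main__":
    print(distributed_sum_of_squares(range(1_000_000)))
```

The `if __name__ == "__main__":` guard matters here: on platforms that start worker processes by re-importing the main module, it keeps the workers from launching pools of their own.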

Types of applications

There are fundamental properties to understand about distributed systems before digging into how to build them in Python or any other language.

We can consider the following options when writing an application:

Single-threaded application

  • Write a single-threaded application. This should be your first pick, and it implies no distribution at all. Single-threaded applications are the simplest of all: they are easy to understand and therefore easier to maintain. However, they are limited by the power of a single processor.
Single-threaded program
Multi-threaded program

Multi-threaded application

  • Write a multi-threaded application. Most computers, even smartphones, are now equipped with multiple processing units. If an application can saturate an entire CPU, it needs to spread its workload over other processors by spawning new threads (or new processes). Multi-threaded applications are more error-prone than single-threaded applications, but they have fewer failure scenarios than multi-node applications, as no network is involved.
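
To make the "more error-prone" point concrete, here is a minimal sketch of one classic hazard: several threads updating shared state. The `Counter` class and `run_workers` function are hypothetical names chosen for this example. The lock is what keeps the final count correct; without it, concurrent read-modify-write updates could be lost.

```python
import threading


class Counter:
    """A shared counter that is safe to update from several threads."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # `self.value += 1` is a read-modify-write sequence. Without the lock,
        # two threads could read the same old value and one update would be lost.
        with self._lock:
            self.value += 1


def run_workers(thread_count=8, increments_per_thread=10_000):
    """Start `thread_count` threads that all increment the same counter."""
    counter = Counter()

    def work():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


if __name__ == "__main__":
    print(run_workers())  # 80000 with the lock in place
```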

Network distributed application

  • Write a network-distributed application. This is your last resort, for when your application needs to scale significantly and not even one big computer with plenty of CPUs is enough. These are the most complicated applications to write because they rely on a network. That exposes them to a variety of failure scenarios, such as the total or partial failure of a node or of the network, high latency, lost messages, and anything else that follows from the unreliability of networks.
Distributed application
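
One common way to cope with the failure scenarios listed above is to retry a failed call after a short delay. The sketch below simulates this without any real network: `NetworkError`, `make_flaky_call`, and `call_with_retry` are all hypothetical names invented for the example, and the "remote call" is a plain function that fails a fixed number of times before succeeding.

```python
import time


class NetworkError(Exception):
    """Hypothetical error standing in for a timeout, lost message, or dead node."""


def make_flaky_call(failures_before_success):
    """Build a fake remote call that fails a fixed number of times, then succeeds."""
    state = {"remaining": failures_before_success}

    def call(payload):
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise NetworkError("simulated network failure")
        return f"ack:{payload}"

    return call


def call_with_retry(call, payload, attempts=5, base_delay=0.01):
    """Retry `call` with exponential backoff; re-raise after the last attempt."""
    for attempt in range(attempts):
        try:
            return call(payload)
        except NetworkError:
            if attempt == attempts - 1:
                raise  # Out of attempts: let the caller handle the failure.
            # Wait longer after each failure: base_delay, 2x, 4x, ...
            time.sleep(base_delay * 2 ** attempt)


if __name__ == "__main__":
    flaky = make_flaky_call(failures_before_success=2)
    print(call_with_retry(flaky, "job-1"))  # ack:job-1
```

Retrying is only safe when repeating the operation does no harm. If a message was actually delivered but its acknowledgement was lost, a retry runs the operation twice.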

The properties of distribution vary widely depending on the type you pick. Operations on a single processor can be regarded as fast, low-latency, reliable, and ordered, whereas operations across several nodes should be considered slow and high-latency, and are often unreliable and unordered.
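
The "unordered" property can be sketched on a single machine by simulating nodes with threads and artificial delays. This is only an approximation under stated assumptions: `remote_job` and `run_jobs` are hypothetical helpers, and `time.sleep` stands in for network latency. Results are collected in the order the jobs finish, which typically differs from the order in which they were submitted.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def remote_job(job_id, latency):
    """Stand-in for a call to another node; `latency` simulates network delay."""
    time.sleep(latency)
    return job_id


def run_jobs(latencies):
    """Submit one job per latency and return the job ids in completion order."""
    with ThreadPoolExecutor(max_workers=len(latencies)) as pool:
        futures = [
            pool.submit(remote_job, job_id, latency)
            for job_id, latency in enumerate(latencies)
        ]
        # as_completed yields futures as they finish, not as they were submitted.
        return [future.result() for future in as_completed(futures)]


if __name__ == "__main__":
    completed = run_jobs([0.3, 0.1, 0.2])
    print("completion order:", completed)
    print("restored order:  ", sorted(completed))
```

Because completion order depends on timing, an application that needs ordered results has to restore the order itself, for example by tagging each job with an identifier as done here.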

Consider each architecture choice or change carefully. As you will see throughout this course, Python offers various tools and methods for dealing with each of these choices. They help to build distributed systems and, therefore, scalable applications.