What is a distributed system–take 1

That’s easy to answer.

A distributed system is a system that is distributed.

Simple!

But we should really avoid defining a term using the term itself.

Let’s try again.

What is a distributed system–take 2

You are likely familiar with the term ‘system’. Think of your laptop. That’s a system containing CPUs, RAM, disk drives, numerous chips, etc.

In this context, a computer is a system, a system that is capable of storing information and also crunching numbers.

On the other hand, “distributed” means scattered, divided, or dispersed.

So when you hear the term “distributed systems”, you can safely assume that the term involves multiple machines, possibly hundreds or thousands of them.

But hang on, this is still not a definition. Let’s try this one more time.

What is a distributed system–take 3

This time, let’s first think of a system that isn’t distributed.

A single system is not a distributed system. Your single machine operating solely on its own is not a distributed system.

But hang on, don’t forget that your machine has multiple cores. All those cores may run distinct programs in parallel.

And those programs are perhaps communicating with one another. Does that mean all those cores of your machine create a distributed system?

Well, technically, yes. But that is not how the term is used in practice today. For a system to be called ‘distributed’, it has to span multiple machines.

Before we get to a more formal definition, note that there is a distributed system that exists everywhere around you, and that you definitely use every day. It is so obvious that you might never actually notice it. Can you think of what it could be? We will discuss the answer in the following lesson.

Now let’s go for a formal definition:

A distributed system is a collection of autonomous computing elements that appears to its users as a single coherent system.

Distributed Systems, van Steen & Tanenbaum

There are two very important aspects to this definition. Let’s discuss them.

A distributed system is a collection of autonomous computing elements

A computing element is an entity that can compute stuff. Basically, your computer, or a core of your computer, or a single process running on a core of your machine can all be thought of as a computing element.

But as mentioned above, in our context, our focus will be more on computing machines than on core- or process-level elements. So essentially, a computing element is a physical machine, and we will refer to it as a node.

These nodes are capable of running and dying independently of each other. But that doesn’t mean we can just run these nodes, and they’ll achieve our purpose. The nodes will have to be programmed so that they interact with each other and serve the users of the system. Here, a user can be a human or even another system.

So, in short, in a distributed system, there will be nodes that will communicate and interact with one another. Through this interaction, they will achieve a common goal. And the nodes will be programmed in such a way that they can perform their duties autonomously.
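The idea of autonomous nodes working toward a common goal can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the threads below stand in for separate machines, the queue stands in for a network, and the names Node and distributed_sum are hypothetical. Each node computes its own partial result independently and reports it as a message; nobody reaches into another node's memory.

```python
import queue
import threading


class Node(threading.Thread):
    """A hypothetical node: works on its own and only talks via messages."""

    def __init__(self, name, numbers, coordinator_inbox):
        super().__init__(name=name)
        self.numbers = numbers
        self.coordinator_inbox = coordinator_inbox

    def run(self):
        # Each node does its share of the work independently...
        partial = sum(self.numbers)
        # ...and reports the result by sending a message, not by sharing state.
        self.coordinator_inbox.put((self.name, partial))


def distributed_sum(numbers, node_count=3):
    """Split the work across nodes and combine their partial sums."""
    inbox = queue.Queue()  # stands in for the network (an assumption)
    chunks = [numbers[i::node_count] for i in range(node_count)]
    nodes = [Node(f"node-{i}", chunk, inbox) for i, chunk in enumerate(chunks)]
    for node in nodes:
        node.start()

    total = 0
    for _ in nodes:
        _sender, partial = inbox.get()  # wait for one message per node
        total += partial

    for node in nodes:
        node.join()
    return total
```

Calling distributed_sum(list(range(1, 101))) gives the same answer as a plain sum over the list. The caller never needs to know how many nodes took part.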

A distributed system feels like a single coherent system

This may not be very obvious.

As a user, when you interact with a system, for example, your Instagram feed, you don’t really think of the many machines in the background running 24/7 to power your home page. To us users, a distributed system behaves in such a way that it is very hard to tell whether there is only one node doing all the computations for us or numerous nodes crunching numbers behind the curtain.

The above essentially means that a distributed system should be coherent throughout its operation. Users should see what they expect.
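To make the "single coherent system" idea concrete, here is a minimal sketch, assuming a hypothetical PhotoStore that spreads its data over several in-memory StorageNode objects. Neither class comes from any real system; they only show how one simple interface can hide how many nodes sit behind it.

```python
import hashlib


class StorageNode:
    """A hypothetical node holding one slice of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class PhotoStore:
    """The single face that users see, whatever the node count is."""

    def __init__(self, node_count):
        self._nodes = [StorageNode(f"node-{i}") for i in range(node_count)]

    def _node_for(self, key):
        # Hash the key so the same key always lands on the same node.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._nodes)
        return self._nodes[index]

    def put(self, key, value):
        self._node_for(key).data[key] = value

    def get(self, key):
        return self._node_for(key).data.get(key)
```

Whether the store is created with 1 node or 100, put and get behave the same way from the caller's point of view, which is the property the definition is pointing at.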

Intuitively, we can say that a system like Instagram is never running on a single node. To be honest, it is entirely possible that the people working at Instagram don’t even know how many nodes their system is running on.

When you open Instagram, you expect to see the photos that the accounts you follow have recently shared. You also expect to be able to scroll through your homepage seamlessly. Then you may want to visit your profile, or check your direct messages, or perhaps upload a new photo to share with your friends.

All of this navigation in the Instagram app feels smooth. That is what distributed systems aim to do. As a user, you will see that everything is consistent: every part of the application does what you expect it to do.

But at the same time, sometimes things break. One fine morning, you may see that your feed is not loading on the Instagram app, but that you can still upload a new photo. This reveals that one part of the Instagram system runs on one set of nodes, and another part probably runs independently on a different set of nodes. This is what we call a state of incoherence.

A distributed system aims at avoiding such inconsistent behavior. It wants to be coherent. And, of course, this is a difficult task by nature.
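The broken-feed scenario can be simulated with a small sketch. Everything here is hypothetical: the Service class, the node names, and the health flags are assumptions made for illustration, not a description of how Instagram actually works. The point is only that two parts of one application can fail independently when they run on different sets of nodes.

```python
class ServiceUnavailable(Exception):
    """Raised when none of a service's nodes can answer."""


class Service:
    """A hypothetical service backed by its own set of nodes."""

    def __init__(self, name, node_health):
        self.name = name
        self.node_health = dict(node_health)  # node name -> is it alive?

    def call(self, request):
        # Use the first healthy node; fail if there is none.
        for node, alive in self.node_health.items():
            if alive:
                return f"{node} served {request}"
        raise ServiceUnavailable(f"{self.name} has no healthy node")


def open_app(feed, upload):
    """Try both parts of the app and report what the user would see."""
    report = {}
    for label, service, request in (
        ("feed", feed, "load feed"),
        ("upload", upload, "upload photo"),
    ):
        try:
            report[label] = service.call(request)
        except ServiceUnavailable:
            report[label] = "unavailable"
    return report


# All feed nodes are down, while the upload node is still running.
feed = Service("feed", {"feed-1": False, "feed-2": False})
upload = Service("upload", {"upload-1": True})
report = open_app(feed, upload)
```

Here report shows the feed as unavailable while the upload still succeeds, which is exactly the kind of incoherence a user would notice.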

Key takeaways

To summarize, a distributed system is a system where:

  • A bunch of nodes run autonomously, each programmed to do its own part.
  • The nodes have a common goal to achieve, like serving users with some data (basically, all systems do this in one way or another).
  • The nodes behave coherently, meaning that users should see what they expect.