Fault Tolerance

Learn about fault tolerance in distributed systems.

Nobody should turn a perfectly working simple system into a distributed one without convincing reasons. That decision should come only after a thorough discussion and evaluation.

On the other hand, a correctly built distributed system gives us advantages that a simple system cannot.

In this chapter, we’ll explore the core goals that you need to keep in mind to build a distributed system the correct way.

Let’s start with fault tolerance.

Fault tolerance in distributed systems

Anything that can go wrong, will go wrong.

Edward A. Murphy Jr. (1918-1990)

In the world of distributed systems, Murphy’s Law is not just a saying; it’s a fact.

Things go wrong in all kinds of ways.

A robust system must be able to handle adverse scenarios, and that robustness does not come easily or for free. It requires energy, effort, ...