Fault Tolerance
Learn about fault tolerance in distributed systems.
Nobody should transform a perfectly working simple system into a distributed one without convincing reasons. The decision to go distributed should come only after thorough discussion and evaluation.
On the other hand, a distributed system that is built correctly offers advantages that a simple system cannot.
In this chapter, we’ll explore the core goals that you need to keep in mind to build a distributed system the correct way.
Let’s start with fault tolerance.
Fault tolerance in distributed systems
Anything that can go wrong, will go wrong.
Edward A. Murphy Jr. (1918-1990)
In the world of distributed systems, Murphy’s Law is not just a saying; it’s a fact.
Things go wrong in all kinds of ways.
A robust system must be able to handle adverse scenarios. That kind of robustness does not come easily, and it does not come for free. It requires energy, effort, ...