Did this Node Time-Travel into the Past?

Time is tricky even within a single node. It's even trickier in distributed systems. Let's see why.

Let’s debug a Process Crash

By the way:

If someone hasn’t told you already, I have news for you. There is usually no Debugger in a Distributed System when you are investigating an issue. At most, you’ll have Log Files and system metrics (if you are lucky).

Suppose you are debugging a problem that caused downtime in your service because one of its processes crashed.

You’ve identified that the issue occurred during communication between Node-A and Node-B. Processes on both Nodes were creating log files.

Here’s how logging worked:

  1. For every log write, the process notes the current clock time as the timestamp.
  2. It writes the log entry in the log file with the timestamp.
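The two steps above can be sketched in a few lines of Python. This is only an illustration, not the actual logging code of either node: the names `write_log` and `clock` are hypothetical, and the ISO 8601 timestamp format is an assumption. The point to notice is that the timestamp comes from the node's own clock at the moment of the write.

```python
import io
import time
from datetime import datetime, timezone


def write_log(log_file, message, clock=time.time):
    """Append one log entry stamped with this node's current clock reading.

    `clock` is a hypothetical hook that returns seconds since the Unix epoch;
    it defaults to the local wall clock.
    """
    # Step 1: note the current clock time as the timestamp.
    timestamp = clock()
    # Step 2: write the log entry to the log file with that timestamp.
    # (Rendering it as an ISO 8601 UTC string is an assumption of this sketch.)
    stamp = datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()
    log_file.write(f"{stamp} {message}\n")
    return timestamp


# Usage: an in-memory buffer stands in for the node's log file.
log = io.StringIO()
write_log(log, "received request from Node-A")
```

Because each process reads only its own node's clock, the timestamps in two nodes' log files are only as comparable as those two clocks are.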

Let’s look at both log files.

Here’s the log file for Node-B (we’re looking at it first because it’s more straightforward). ...
