Did this Node Time-Travel into the Past?
Time is tricky even within a single node. It's even trickier in Distributed Systems. Let's see why.
Let’s debug a Process Crash
By the way:
If someone hasn’t told you already, I have news for you. There is usually no Debugger in a Distributed System when you are investigating an issue. At most, you’ll have Log Files and system metrics (if you are lucky).
Suppose you are debugging a problem that caused downtime in your service because one of its processes crashed.
You’ve identified that the issue occurred during communication between Node-A and Node-B. Processes on both Nodes were creating log files.
Here’s how logging worked:
- For every log write, the process notes the current clock time as the timestamp.
- It writes the log entry in the log file with the timestamp.
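The two logging steps above can be sketched in a few lines of Python. This is only an illustration, not the code the Nodes actually ran: the names `format_entry` and `write_log`, the UTC ISO-8601 timestamp format, and the injectable `clock` parameter are all assumptions made for the example.

```python
import time
from datetime import datetime, timezone
from typing import Callable


def format_entry(message: str, clock: Callable[[], float] = time.time) -> str:
    """Step 1: read the node's current clock time and use it as the timestamp.

    `clock` is a hypothetical hook that returns seconds since the Unix epoch.
    It defaults to the node's wall clock (time.time).
    """
    ts = datetime.fromtimestamp(clock(), tz=timezone.utc)
    return f"{ts.isoformat(timespec='milliseconds')} {message}"


def write_log(path: str, message: str,
              clock: Callable[[], float] = time.time) -> None:
    """Step 2: append the timestamped entry to the log file."""
    with open(path, "a", encoding="utf-8") as log_file:
        log_file.write(format_entry(message, clock) + "\n")
```

Note that the timestamp is whatever the local clock reports at the moment of the write, so each log file is only as trustworthy as the clock of the Node that produced it.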
Let’s look at both log files.
Here’s the log file for Node B (we’re looking at it first because it’s more straightforward). ...