Human Readability of Log Files and Voodoo Operations
Explore how to design log files for human readability to reduce misinterpretation during critical system incidents. Understand the impact of ambiguous messages and pattern detection biases on operational decisions. Learn best practices for logging, including traceable identifiers and capturing state transitions to aid troubleshooting and postmortems.
Human factors
Above all else, log files are human-readable. That means they constitute a human-computer interface and should be examined in terms of human factors. This might sound trivial, even laughable, but in a stressful situation, such as a Severity 1 incident, human misinterpretation of status information can prolong or aggravate the problem. Operators for the Three Mile Island reactor misinterpreted the meaning of coolant pressure and temperature values, leading them to take exactly the wrong action at every turn (see Inviting Disaster [Chi01]). Although most of our systems will not vent radioactive steam when they break, they will expel our money and our reputation.
Therefore, we should ensure that log files convey clear, accurate, and actionable information to the humans who read them. If log files are a human interface, then they should also be written such that humans can recognize and interpret them as rapidly as possible. The format should be as readable as ...
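As one illustration of a log line meant to be scanned quickly by a person, here is a minimal Python sketch using the standard logging module. It is not a format prescribed by this lesson. The column layout, the txn_id field, the state names, and the helpers build_logger and log_transition are all hypothetical, chosen only to show fixed-width columns, a traceable identifier, and an explicit state transition with a suggested action.

```python
import io
import logging

# Hypothetical layout: fixed-width columns so the eye can scan down the page.
# txn_id is an assumed custom field, supplied through the `extra` argument.
LOG_FORMAT = "%(asctime)s %(levelname)-8s [%(txn_id)s] %(name)-12s %(message)s"
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def build_logger(stream, name="pool"):
    """Return a logger that writes column-aligned lines to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT, datefmt=DATE_FORMAT))
    logger = logging.getLogger(name)
    logger.handlers.clear()   # avoid duplicate lines if called more than once
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the example's output in one place
    return logger


def log_transition(logger, txn_id, component, old_state, new_state, action):
    """Log a state transition with a traceable identifier and a suggested action."""
    logger.warning(
        "%s changed state %s -> %s; action: %s",
        component, old_state, new_state, action,
        extra={"txn_id": txn_id},
    )


# Usage: write one line to an in-memory stream and print it.
stream = io.StringIO()
logger = build_logger(stream)
log_transition(
    logger,
    "txn-0042",                       # made-up identifier
    "orders-db connection pool",      # made-up component name
    "HEALTHY",
    "EXHAUSTED",
    "check for leaked connections",
)
line = stream.getvalue()
print(line, end="")
```

The message says what changed, from which state to which state, and what an operator might do next, instead of a bare error string that invites guessing during an incident. Because the format string requires txn_id, every call in this sketch has to supply it through extra.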