Design Distributed Logging

Learn to design a distributed logging service.

A log file records details of events that occur in a software application. These details may identify the microservices involved, transactions, service actions, or anything else that helps debug the flow of an event through the system. Logging is crucial for monitoring the application’s flow.

Need for logging

Logging is essential to understanding the flow of an event in a distributed system. It may seem like a tedious task, but when a failure or security breach occurs, logs help pinpoint when and how the system failed or was compromised. They can also help find the root cause of the failure or breach. As a result, logging decreases the mean time to repair (MTTR) of a system. MTTR is a basic measure of the maintainability of repairable items; it represents the average time required to repair a failed component or device (Source: Wikipedia).

Why don’t we simply use print statements to understand the application flow? It’s possible but not ideal. Simple print statements have no way of tracking the severity of a message. The output of print functions usually goes to the terminal, whereas we may need to persist such data in a local or remote store. Moreover, an application can emit millions of such statements, so it’s better to structure and store them properly.
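To make the contrast with print statements concrete, here is a minimal sketch using Python's standard logging module. It assumes a single process writing to a local file; the logger name "checkout", the messages, and the temporary file location are hypothetical and chosen only for illustration.

```python
import logging
import os
import tempfile

# Hypothetical log location, used only for this sketch.
log_path = os.path.join(tempfile.mkdtemp(), "app.log")

logger = logging.getLogger("checkout")  # illustrative logger name
logger.setLevel(logging.INFO)           # records below INFO are dropped
logger.propagate = False                # keep output out of the root logger

# Persist records to a file instead of the terminal, with a fixed structure.
handler = logging.FileHandler(log_path)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
)
logger.addHandler(handler)

logger.debug("cart contents: %s", ["book"])  # below the threshold, not stored
logger.info("order placed")
logger.error("payment failed")

handler.close()

with open(log_path) as f:
    lines = f.read().splitlines()
```

Unlike print, each record carries a severity level that can be filtered on, and the destination is a handler that could be swapped for one that ships records to a remote store.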

Issues with using print statements as an alternative to logging

Concurrent activity by a service running on many nodes might need causality information to stitch together a correct flow of events properly. We must be careful while dealing with causality in a distributed system. We use a logging service to appropriately manage our distributed software's diagnostic and exploratory data.
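The text does not name a specific mechanism for carrying causality information. One common option is a Lamport logical clock, sketched below as an assumption rather than as this design's actual method. The Node class, the node names, and the messages are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical service instance that stamps logs with a Lamport clock."""
    name: str
    clock: int = 0
    log: list = field(default_factory=list)

    def local_event(self, message):
        self.clock += 1
        self.log.append((self.clock, self.name, message))

    def send(self, message):
        self.clock += 1
        self.log.append((self.clock, self.name, f"send {message}"))
        return self.clock  # the timestamp travels with the message

    def receive(self, message, sent_at):
        # Jump ahead of the sender so the receive is ordered after the send.
        self.clock = max(self.clock, sent_at) + 1
        self.log.append((self.clock, self.name, f"recv {message}"))


a, b = Node("A"), Node("B")
a.local_event("validate cart")
a.local_event("reserve stock")
ts = a.send("order-42")
b.receive("order-42", ts)
b.local_event("charge card")

# Sorting the combined logs by logical timestamp respects cause and effect.
merged = sorted(a.log + b.log)
messages = [message for _, _, message in merged]
```

Without the max rule in receive, node B would stamp its first record with 1, and the merged log would show the message being received before it was sent.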

Logging allows us to understand our code, locate unforeseen errors, fix the identified errors, and visualize the application’s performance. This way, we are aware of how production works, and we know how processes are running in the system.

Log analysis helps us with the following scenarios:

  • To troubleshoot applications, nodes, or network issues.

  • To adhere to internal security policies, external regulations, and compliance.

  • To recognize and respond to data breaches and other security problems.

  • To comprehend users’ actions for input to a recommender system.

Logging in a distributed system

In today’s world, more designs are moving to microservice architecture instead of monolithic architecture. In a microservice architecture, logs of each microservice are accumulated in the respective machine. If we want to know about a certain event that was processed by several microservices, it is difficult to go into every node, figure out the flow, and view error messages. ...
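As a rough illustration of why collecting logs in one place helps, the sketch below merges per-node logs and extracts the records for a single request. The record fields (ts, request_id, service, msg), the service names, and the trace helper are hypothetical, and the sketch assumes timestamps are comparable across nodes, which real clocks do not guarantee.

```python
import heapq

# Hypothetical logs, one list per node, each already sorted by timestamp.
node_logs = {
    "gateway": [
        {"ts": 1, "request_id": "r1", "service": "gateway", "msg": "request received"},
        {"ts": 2, "request_id": "r2", "service": "gateway", "msg": "request received"},
    ],
    "orders": [
        {"ts": 3, "request_id": "r1", "service": "orders", "msg": "order created"},
    ],
    "payments": [
        {"ts": 4, "request_id": "r2", "service": "payments", "msg": "payment ok"},
        {"ts": 5, "request_id": "r1", "service": "payments", "msg": "card declined"},
    ],
}


def trace(request_id, logs_by_node):
    """Merge all node logs by timestamp and keep one request's records."""
    merged = heapq.merge(*logs_by_node.values(), key=lambda r: r["ts"])
    return [r for r in merged if r["request_id"] == request_id]


flow = [(r["service"], r["msg"]) for r in trace("r1", node_logs)]
```

Here flow lists the path of request r1 across the three services in order, which is the view that is hard to assemble by logging in to each machine separately.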
