Detailed Design of a Monitoring System
Define the core components of a distributed monitoring system, such as the data collector, service discoverer, and alert manager. Analyze the drawbacks of simple pull-based architectures, such as a single point of failure (SPOF) and scaling limits. Implement a hybrid pull/push hierarchical pattern so that the monitoring system design scales globally.
This section defines the core storage components of the monitoring system, identifies design limitations, and refines the architecture to meet the stated requirements.
Storage
The system requires three distinct storage mechanisms:
Time series database (TSDB): Stores metrics locally on the monitoring server for fast write and read operations.
Blob storage: Serves as a separate storage node for long-term retention of metric data.
Rules database: Stores alert configurations, that is, rules and their corresponding actions. For example, a rule might state that if CPU usage exceeds 90%, the system must send an alert to the administrator.
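To make the idea of rules and their corresponding actions concrete, here is a minimal sketch of how an entry in the rules database might be represented and evaluated. The names `AlertRule`, `triggered_actions`, `cpu_usage_percent`, and `notify_admin` are hypothetical and chosen only for illustration; a real rules store would likely use its own schema and a richer condition language.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List

# Comparison operators a rule may use (an assumed, deliberately small set).
_OPS: Dict[str, Callable[[float, float], bool]] = {
    ">": operator.gt,
    "<": operator.lt,
}


@dataclass(frozen=True)
class AlertRule:
    """One row of a hypothetical rules database."""
    metric: str       # metric name, e.g. "cpu_usage_percent" (assumed)
    op: str           # key into _OPS
    threshold: float  # value the metric is compared against
    action: str       # action label, e.g. "notify_admin" (assumed)


def triggered_actions(rules: List[AlertRule], sample: Dict[str, float]) -> List[str]:
    """Return the action of every rule whose condition holds for the sample."""
    actions = []
    for rule in rules:
        value = sample.get(rule.metric)
        # Rules for metrics missing from the sample are skipped.
        if value is not None and _OPS[rule.op](value, rule.threshold):
            actions.append(rule.action)
    return actions


# The example from the text: alert the administrator when CPU usage exceeds 90%.
RULES = [AlertRule("cpu_usage_percent", ">", 90.0, "notify_admin")]
```

Calling `triggered_actions(RULES, {"cpu_usage_percent": 95.0})` returns the `notify_admin` action, while a reading of exactly 90% does not trigger it, because the rule says "exceeds". Keeping rules as data rather than code means they can be changed without redeploying the monitoring service.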
In summary, we have added a rules database for actionable logic and a blob store for data persistence.
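The division of labor between the local time series database and the blob store can be sketched as a two-tier store. This is an illustrative model under stated assumptions, not the design's actual implementation: both tiers are plain in-memory dictionaries, and the `TieredMetricStore` class, its `offload` method, and the `local_retention_s` parameter are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[int, float]  # (Unix timestamp in seconds, metric value)


class TieredMetricStore:
    """Recent points stay local; older points move to long-term storage."""

    def __init__(self, local_retention_s: int) -> None:
        self.local_retention_s = local_retention_s
        # Stand-in for the TSDB on the monitoring server.
        self.local: Dict[str, List[Point]] = defaultdict(list)
        # Stand-in for the separate blob storage node.
        self.blob: Dict[str, List[Point]] = defaultdict(list)

    def write(self, metric: str, ts: int, value: float) -> None:
        """New points always land in the local tier for fast writes."""
        self.local[metric].append((ts, value))

    def offload(self, now: int) -> int:
        """Move points older than the retention window to the blob tier.

        Returns the number of points moved.
        """
        cutoff = now - self.local_retention_s
        moved = 0
        for metric, points in list(self.local.items()):
            old = [p for p in points if p[0] < cutoff]
            if old:
                self.blob[metric].extend(old)
                self.local[metric] = [p for p in points if p[0] >= cutoff]
                moved += len(old)
        return moved
```

In this sketch, a periodic job would call `offload` with the current time, so the local tier stays small and fast while the blob tier accumulates history.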