Alerting on Metrics Abnormalities

Understand how to define alerting rules and configure Alertmanager with Prometheus to notify your team of abnormal application behaviors. This lesson covers setting up alerting in your observability stack to help you respond quickly to deviations in metrics such as latency, enhancing your ability to maintain reliable applications.

Metrics provide time-series measurements of how our applications and infrastructure behave, but on their own they do not notify us when those measurements deviate from what we expect. To react to abnormal behavior, we need rules that define what normal behavior looks like and a way to be notified when our applications deviate from it.

Alerting on metrics enables us to define behavioral norms and specify how we should be notified when our applications behave abnormally. For example, if we expect our application to return HTTP responses in under 100 milliseconds and we observe a 5-minute span in which responses take longer than 100 milliseconds, we would want to be notified of the deviation.
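As a minimal sketch of how the latency example above could be written as a Prometheus alerting rule, consider the following rule file. The group name, alert name, label values, and the metric names `http_request_duration_seconds_sum` and `http_request_duration_seconds_count` are assumptions made for illustration; the metrics your application exposes may be named differently.

```yaml
# Hypothetical rule file; names and metrics are illustrative assumptions.
groups:
  - name: latency-alerts            # assumed group name
    rules:
      - alert: HighRequestLatency   # assumed alert name
        # Average request duration over the last minute, in seconds.
        # Assumes the application exposes a histogram or summary named
        # http_request_duration_seconds.
        expr: |
          sum(rate(http_request_duration_seconds_sum[1m]))
            /
          sum(rate(http_request_duration_seconds_count[1m]))
            > 0.1
        # Fire only if the condition holds continuously for 5 minutes.
        for: 5m
        labels:
          severity: warning         # assumed label value
        annotations:
          summary: "Average HTTP latency has been above 100 ms for 5 minutes"
```

The `expr` field holds the condition that counts as abnormal, and the `for` field expresses the 5-minute span: the alert stays pending until the condition has held for that long, and only then fires.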

In this lesson, we will learn how to extend our current configuration of services to include an Alertmanager service that provides alerts when observed behavior deviates from expected norms. Alertmanager is a Prometheus component that handles alerts by triggering the appropriate actions. We'll learn how to define alerting rules and specify where to send notifications when our application behaves abnormally.

Adding

...