Advanced Kubernetes Techniques: Monitoring, Logging, Auto-Scaling/

...

Using Internal Metrics to Debug Potential Issues

In this lesson, we will use internal metrics to debug the potential issues we face.

We'll cover the following...

- Issue with nginx_ingress_controller_request_duration_seconds
- Switch to http_server_resp_time metric
- - Include labels into expressions
  - Adding a threshold
- Benefit of Instrumentation
- Combine generic metrics with detailed metrics

We’ll resend requests with slow responses again so that we get to the same point where we started this chapter.

for i in {1..20}; do
    DELAY=$[ $RANDOM % 10000 ]
    curl "http://$GD5_ADDR/demo/hello?delay=$DELAY"
done

open "http://$PROM_ADDR/alerts"

We sent twenty requests that will result in responses with random duration (up to ten seconds). Later on, we opened the Prometheus' alerts screen.

A while later, the AppTooSlow alert should fire (remember to refresh your screen), and we have a (simulated) problem that needs to be solved. Before we start panicking and do something hasty, we’ll try to find the cause of the issue.

Please click the expression of the AppTooSlow alert.

Issue with `nginx_ingress_controller_request_duration_seconds` #

We are redirected to the graph screen with the pre-populated expression from the alert. Feel free to click the Expression button, even though it will not provide any additional info, apart from the fact that the application was fast, and then it slowed down for some inexplicable reason. You will not be able to gather more details from that expression. You will not know whether it’s slow on all methods, whether only a specific path responds slow, nor much of any other application-specific details. Simply put, the nginx_ingress_controller_request_duration_seconds metric is too generic. It served us ...

Before Getting Started

Autoscaling Deployments and StatefulSets

Auto-Scaling Nodes Of A Kubernetes Cluster

Collecting and Querying Metrics and Sending Alerts

Debugging Issues Discovered Through Metrics and Alerts

Extending HorizontalPodAutoscaler With Custom Metrics

Visualizing Metrics And Alerts

Collecting And Querying Logs

Conclusion

Using Internal Metrics to Debug Potential Issues

Issue with `nginx_ingress_controller_request_duration_seconds` #

Using Internal Metrics to Debug Potential Issues

Issue with nginx_ingress_controller_request_duration_seconds #

Issue with `nginx_ingress_controller_request_duration_seconds` #