Measure latency #

We’ll use the go-demo-5 application to measure latency, so our first step is to install it.

GD5_ADDR=go-demo-5.$LB_IP.nip.io

kubectl create namespace go-demo-5

helm install go-demo-5 \
    https://github.com/vfarcic/go-demo-5/releases/download/0.0.1/go-demo-5-0.0.1.tgz \
    --namespace go-demo-5 \
    --set ingress.host=$GD5_ADDR

We generated an address that we’ll use as an Ingress entry-point, and we deployed the application using Helm. Now we should wait until it rolls out.

kubectl -n go-demo-5 \
    rollout status \
    deployment go-demo-5

Before we proceed, we’ll check whether the application is indeed working correctly by sending an HTTP request.

curl "http://$GD5_ADDR/demo/hello"

The output should be the familiar hello, world! message.

Get the duration of requests entering the system #

Now, let’s see whether we can, for example, get the duration of requests entering the system through Ingress.

open "http://$PROM_ADDR/graph"

If you click on the - insert metric at cursor - drop-down list, you’ll be able to browse through all the available metrics. The one we’re looking for is nginx_ingress_controller_request_duration_seconds_bucket. As its name implies, the metric comes from the NGINX Ingress Controller and provides request durations in seconds, grouped into buckets.

Please type the expression that follows and click the Execute button.

nginx_ingress_controller_request_duration_seconds_bucket

In this case, seeing the raw values might not be very useful, so please click the Graph tab.

You should see one graph for each Ingress. Each line is increasing because the metric in question is a counter; its value grows with each request.

🔍 A Prometheus counter is a cumulative metric whose value can only increase, or be reset to zero on restart.

Calculate the rate of requests #

What we need is to calculate the rate of requests over a period of time. We’ll accomplish that by combining the sum and rate functions. The former should be self-explanatory.

🔍 Prometheus's rate function calculates the per-second average rate of increase of the time series in the range vector.
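As a simplified illustration of what rate does, consider a handful of counter samples inside a five-minute range vector. The sketch below uses made-up sample values, and it ignores the extrapolation to window boundaries and the counter-reset handling that real Prometheus performs; it only shows the core idea of increase divided by elapsed time.

```python
# Simplified sketch of what Prometheus's rate() computes for a counter.
# Real Prometheus also extrapolates to the window boundaries and handles
# counter resets; this only demonstrates increase / elapsed time.

# Hypothetical (timestamp_seconds, counter_value) samples in a 5m window.
samples = [(0, 100), (60, 160), (120, 220), (180, 280), (240, 340), (300, 400)]

increase = samples[-1][1] - samples[0][1]  # 400 - 100 = 300 requests
elapsed = samples[-1][0] - samples[0][0]   # 300 seconds
per_second_rate = increase / elapsed       # requests per second
print(per_second_rate)
```

With those sample values, the counter grew by 300 over 300 seconds, giving a rate of one request per second.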

Please type the expression that follows, and press the Execute button.

sum(rate(
  nginx_ingress_controller_request_duration_seconds_count[5m]
)) 
by (ingress)

The resulting graph shows us the per-second rate of all the requests entering the system through Ingress. The rate is calculated over five-minute intervals. If you hover over one of the lines, you’ll see additional information, such as the value and the Ingress. The by clause allows us to group the results by ingress.

Still, the result by itself is not very useful, so let’s redefine our requirement. We should be able to find out how many of the requests are slower than 0.25 seconds. We cannot do that directly. Instead, we can retrieve all those that took 0.25 seconds or less.

Please type the expression that follows, and press the Execute button.

sum(rate(
  nginx_ingress_controller_request_duration_seconds_bucket{
    le="0.25"
  }[5m]
)) 
by (ingress)

Percentage of requests #

What we really want is to find the percentage of requests that fall into the 0.25-second bucket. To accomplish that, we’ll get the rate of the requests faster than or equal to 0.25 seconds and divide the result by the rate of all the requests.

Please type the expression that follows, and press the Execute button.

sum(rate(
  nginx_ingress_controller_request_duration_seconds_bucket{
    le="0.25"
  }[5m]
)) 
by (ingress) / 
sum(rate(
  nginx_ingress_controller_request_duration_seconds_count[5m]
)) 
by (ingress)

Since we have not yet generated much traffic, you probably won’t see much in the graph beyond occasional interactions with Prometheus and Alertmanager and the single request we sent to go-demo-5. Nevertheless, the few lines you can see display the percentage of the requests that responded within 0.25 seconds.
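Numerically, the division above works because Prometheus histogram buckets are cumulative: each le="X" bucket counts all observations that took X seconds or less, and the _count series equals the +Inf bucket. A minimal sketch with made-up bucket counts (not real metric values) shows the arithmetic:

```python
# Prometheus histogram buckets are cumulative: le="0.25" already includes
# everything in le="0.1", so no subtraction is needed for "0.25s or faster".
# Hypothetical cumulative bucket counters after some traffic:
buckets = {"0.1": 12, "0.25": 30, "0.5": 70, "1": 110, "+Inf": 120}

# The _count series always equals the +Inf bucket.
total = buckets["+Inf"]

fraction_fast = buckets["0.25"] / total
print(f"{fraction_fast:.0%} of requests finished within 0.25 seconds")
```

With these hypothetical counts, 30 of 120 requests, or 25 percent, finished within 0.25 seconds.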

Limit the results to go-demo-5 Ingress #

For now, we are interested only in go-demo-5 requests, so we’ll refine the expression further to limit the results only to go-demo-5 Ingress.

Please type the expression that follows, and press the Execute button.

sum(rate(
  nginx_ingress_controller_request_duration_seconds_bucket{
    le="0.25", 
    ingress="go-demo-5"
  }[5m]
)) 
by (ingress) / 
sum(rate(
  nginx_ingress_controller_request_duration_seconds_count{
    ingress="go-demo-5"
  }[5m]
)) 
by (ingress)

The graph should be almost empty since we sent only one request. Or, maybe you received the no datapoints found message. It’s time to generate some traffic.

for i in {1..30}; do
  DELAY=$((RANDOM % 1000))
  curl "http://$GD5_ADDR/demo/hello?delay=$DELAY"
done

We sent thirty requests to go-demo-5. The application has a “hidden” feature that delays its response to a request. Given that we want to generate traffic with random response times, we used the DELAY variable with a random value of up to a thousand milliseconds. Now we can re-run the same query and see whether we get more meaningful data.
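Since the delays are drawn uniformly from zero up to a thousand milliseconds, we’d expect roughly a quarter of the responses to finish within 250 ms (ignoring the application’s own processing time). A quick simulation, not part of the lesson, makes that expectation concrete:

```python
import random

random.seed(42)  # deterministic, for the sake of the example

# Simulate many requests with delays uniform in [0, 1000) milliseconds,
# mirroring the DELAY variable in the loop above.
delays = [random.randrange(1000) for _ in range(100_000)]

fraction_fast = sum(d <= 250 for d in delays) / len(delays)
print(round(fraction_fast, 2))  # close to 0.25
```

With only thirty real requests, as in the loop above, the observed fraction will scatter more widely around that 25 percent expectation.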

Please wait for a while until data from new requests are gathered, then type the expression that follows (in Prometheus), and press the Execute button.

sum(rate(
  nginx_ingress_controller_request_duration_seconds_bucket{
    le="0.25", 
    ingress="go-demo-5"
  }[5m]
)) 
by (ingress) / 
sum(rate(
  nginx_ingress_controller_request_duration_seconds_count{
    ingress="go-demo-5"
  }[5m]
)) 
by (ingress)

This time, we can see the emergence of a new line. In my case (screenshot below), around twenty-five percent of requests have durations within 0.25 seconds. In other words, around three-quarters of the requests are slower than expected.
