Comparing Actual Resource Usage with Defined Requests
If we define container resources inside a Pod without relying on actual usage, we are just guessing how much memory and CPU we expect a container to use. I’m sure you already know why guessing, in the software industry, is a terrible idea, so I’ll focus on the Kubernetes aspects only.
Kubernetes treats Pods with containers that do not have specified resources as the BestEffort Quality of Service (QoS). As a result, if it ever runs out of memory or CPU to serve all the Pods, those are the first to be forcefully removed to leave space for others. If such Pods are short-lived, as, for example, those used as one-shot agents for continuous delivery processes, BestEffort QoS is not a bad idea. But when our applications are long-lived, BestEffort QoS should be unacceptable. That means that in most cases, we do have to define container resources.

Knowing that defining container resources is (almost always) a must, we need to know which values to use. I often see teams that merely guess. “It’s a database, therefore it needs a lot of RAM” and “it’s only an API, it shouldn’t need much” are only a few of the sentences I hear a lot. Those guesstimates are often the result of not being able to measure actual usage. When something blows up, those teams simply double the allocated memory and CPU. Problem solved!
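As a reminder of what “specified resources” means in practice, a minimal sketch of a Pod that escapes the BestEffort class follows. The name, image, and values are illustrative assumptions, not recommendations; the point is only the shape of the resources section.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: api                    # hypothetical Pod name
spec:
  containers:
  - name: api
    image: example/api:1.0     # placeholder image
    resources:
      requests:
        memory: 256Mi          # amount the scheduler reserves on a node
        cpu: 100m
      limits:
        memory: 512Mi          # container is OOM-killed if it exceeds this
        cpu: 200m
```

With requests lower than limits, the Pod gets the Burstable QoS class; setting requests equal to limits on every container would make it Guaranteed.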
I never understood why anyone would guess how much memory and CPU an application needs. Even without any “fancy” tools, we always had the top command in Linux, so we could see how much our applications use. Over time, better tools were developed, and all we had to do was Google “how to measure memory and CPU of my applications.” You already saw kubectl top pods in action when you need current data, and you are becoming familiar with the power of Prometheus to give you much more. You have no excuse to guesstimate.
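For a quick look at current figures, kubectl top is enough (it requires the Metrics Server to be running in the cluster; the output columns vary slightly between versions).

```bash
# Current CPU and memory usage per Pod, across all Namespaces
kubectl top pods --all-namespaces

# The same data, broken down per container
kubectl top pods --containers
```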
But why do we care about resource usage compared with requested resources? Besides the fact that a discrepancy might reveal a potential problem (e.g., a memory leak), inaccurate resource requests and limits prevent Kubernetes from doing its job efficiently. If, for example, we set the memory request to 1GB RAM, that’s how much Kubernetes will subtract from the node’s allocatable memory. If a node has 2GB of allocatable RAM, only two such containers could run there, even if each uses only 50MB RAM. Our nodes would use only a fraction of their allocatable memory and, if we have the Cluster Autoscaler, new nodes would be added even though the old ones still have plenty of unused memory.
Even though we know how to get actual memory usage, it would be a waste of time to start every day by comparing YAML files with the results in Prometheus. Instead, we’ll create yet another alert that will send us a notification whenever the requested memory and CPU differ too much from the actual usage. That’s our next mission.
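To make the goal concrete before we build it step by step, here is a sketch of the kind of Prometheus alerting rule we are after. The rule name, the 1.5 threshold, the duration, and the label names are all illustrative assumptions (in particular, older cAdvisor and kube-state-metrics versions expose pod_name instead of pod).

```yaml
groups:
- name: resource-requests
  rules:
  - alert: MemoryFarAboveRequests         # hypothetical alert name
    expr: |
      sum(container_memory_usage_bytes{container!=""}) by (pod)
        / sum(kube_pod_container_resource_requests_memory_bytes) by (pod)
        > 1.5
    for: 1h                               # usage must stay high for an hour
    labels:
      severity: notify
    annotations:
      summary: Actual memory usage is far above the requested amount
```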
First, we’ll reopen the
Prometheus's graph screen.
We already know how to get memory usage through
container_memory_usage_bytes, so we’ll jump straight into retrieving requested memory. If we can combine the two, we’ll get the discrepancy between the requested and the actual memory usage.
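Combining the two comes down to dividing one by the other. A sketch of such an expression follows; it assumes both metrics carry a matching pod label (older versions used pod_name), and it excludes the cgroup aggregate series by requiring a non-empty container label.

```promql
sum(container_memory_usage_bytes{container!=""}) by (pod)
  /
sum(kube_pod_container_resource_requests_memory_bytes) by (pod)
```

A result of 1 would mean a container uses exactly what it requested; values far above or below 1 are what we want to be alerted about.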
The metric we’re looking for is kube_pod_container_resource_requests_memory_bytes, so let’s take it for a spin.

Please type the expression that follows, press the Execute button, and switch to the Graph tab.
We can see from the result that we requested 500MB RAM for the