Upgrading Old Pods
Our primary goal should be to prevent issues from happening by being proactive. When we cannot predict that a problem is about to materialize, we must at least be quick with reactive actions that mitigate issues after they occur. Still, there is a third category that can only loosely be characterized as proactive: we should keep our system clean and up-to-date.
One of the many things we could do to keep the system up-to-date is to make sure that our software is relatively recent (patched, updated, and so on). A reasonable rule could be to renew software after ninety days, if not earlier. That does not mean that everything we run in our cluster must be newer than ninety days, but it might be a good starting point. Further on, we might create finer policies that would allow some kinds of applications (usually third-party) to live up to, let’s say, half a year without being upgraded. Others, especially software we’re actively developing, will probably be upgraded much more frequently. Nevertheless, our starting point is to detect all the applications that have not been upgraded in ninety days or more.
Just as in almost all other exercises in this chapter, we’ll start by opening the Prometheus graph screen and exploring the metrics that might help us reach our goal.
If we inspect the available metrics, we’ll see that there is kube_pod_start_time. Its name gives a clear indication of its purpose: it is a Gauge that holds the Unix timestamp of the start time of each Pod. Let’s see it in action.
Please type the expression that follows and click the Execute button.
kube_pod_start_time
Those values alone are of no use, and there’s no point in teaching you how to convert them into human-readable dates. What matters is the difference between now and those timestamps.
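That difference can be sketched in PromQL. The two expressions below are a minimal illustration, not a finished alert: they assume kube_pod_start_time is expressed in seconds (as a Unix timestamp is), rely on the built-in time() function, and reuse the ninety-day rule from earlier in this section as the threshold. They are separate queries, so they should be entered one at a time.

```promql
# Age of each Pod in seconds: the current Unix time minus the Pod's start timestamp.
time() - kube_pod_start_time

# Only the Pods older than ninety days (60 s * 60 min * 24 h * 90 days = 7776000 seconds).
# The comparison acts as a filter, so Pods younger than the threshold are dropped from the result.
(time() - kube_pod_start_time) > (60 * 60 * 24 * 90)
```

Writing the threshold as a product keeps the intent readable and makes it easy to swap in a different policy, such as half a year for third-party applications.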