Get Started with Auto-Scaling Pods
Explore deploying Kubernetes applications with auto-scaling capabilities. Learn to configure a HorizontalPodAutoscaler that adjusts the number of Pod replicas dynamically based on CPU and memory usage, ensuring efficient resource management while respecting minimum and maximum replica constraints.
Our goal is to deploy an application that will be scaled up or down automatically, depending on its resource usage. We'll deploy the app first, and discuss how to accomplish auto-scaling afterward.
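For orientation before we get there: the simplest way to attach auto-scaling to an existing Deployment is to create a HorizontalPodAutoscaler, which `kubectl` can even generate imperatively. The command below is only a sketch; it assumes a Deployment named `api` already exists in the `go-demo-5` Namespace, and the thresholds are illustrative, not taken from this lesson's files.

```shell
# Create an HPA that keeps between 2 and 6 replicas of the api Deployment,
# scaling when average CPU utilization crosses 80% of the Pods' CPU requests.
kubectl -n go-demo-5 autoscale deployment api \
    --cpu-percent=80 --min=2 --max=6
```

We'll define auto-scaling declaratively later; the imperative form is handy only for quick experiments.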
As I warned you earlier, I assume you are familiar with Kubernetes, and that in this course we'll explore the particular topics of monitoring, alerting, scaling, and a few other things. We will not discuss Pods, StatefulSets, Deployments, Services, Ingress, and other "basic" Kubernetes resources.
Deploy an application
Let’s take a look at a definition of the application we’ll use in our examples.
cat scaling/go-demo-5-no-sidecar-mem.yml
If you are familiar with Kubernetes, the YAML definition should be self-explanatory. We’ll comment on only the parts that are relevant for auto-scaling.
The output, limited to the relevant parts, is as follows.
...
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
  namespace: go-demo-5
spec:
  ...
  template:
    ...
    spec:
      ...
      containers:
      - name: db
        ...
        resources:
          limits:
            memory: "150Mi"
            cpu: 0.2
          requests:
            memory: "100Mi"
            cpu: 0.1
        ...
      - name: db-sidecar
...
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  namespace: go-demo-5
spec:
  ...
  template:
    ...
    spec:
      containers:
      - name: api
        ...
        resources:
          limits:
            memory: 15Mi
            cpu: 0.1
          requests:
            memory: 10Mi
            cpu: 0.01
...
We have two Pods that form the application. The api Deployment is a backend API that uses the db StatefulSet for its state. ...
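Those `requests` values are what auto-scaling will key on: an HPA's `Utilization` target is a percentage of the container's *request*, not its limit. With `api` requesting 10Mi of memory, a target of `averageUtilization: 80` means roughly 8Mi averaged across replicas, and the controller computes `desiredReplicas = ceil(currentReplicas × currentUtilization ÷ targetUtilization)`. As a hedged sketch (the thresholds and replica bounds below are illustrative, not part of this lesson's files), a memory-based HPA for the `api` Deployment could look like this:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api
  namespace: go-demo-5
spec:
  scaleTargetRef:        # the workload whose replica count the HPA manages
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 2         # never scale below this
  maxReplicas: 6         # never scale above this
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80  # 80% of the 10Mi request ≈ 8Mi per replica
```

Keeping requests accurate therefore matters twice: the scheduler uses them for placement, and the HPA uses them as the baseline for every utilization calculation.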