Get Started with Auto-Scaling Pods

Explore deploying Kubernetes applications with auto-scaling capabilities. Learn to configure a HorizontalPodAutoscaler that adjusts the number of Pod replicas dynamically based on CPU and memory usage, so that resources are used efficiently and the replica count never drops below a set minimum.

Our goal is to deploy an application that is automatically scaled up or down depending on its resource usage. We’ll start by deploying the app, and discuss how to accomplish auto-scaling later.

As I already warned you, I assume that you are familiar with Kubernetes. This course explores a specific set of topics: monitoring, alerting, scaling, and a few others. We will not discuss Pods, StatefulSets, Deployments, Services, Ingress, and other “basic” Kubernetes resources.

Deploy an application

Let’s take a look at the definition of the application we’ll use in our examples.

cat scaling/go-demo-5-no-sidecar-mem.yml

If you are familiar with Kubernetes, the YAML definition should be self-explanatory. We’ll comment on only the parts that are relevant for auto-scaling.

The output, limited to the relevant parts, is as follows.

...
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
  namespace: go-demo-5
spec:
  ...
  template:
    ...
    spec:
      ...
      containers:
      - name: db
        ...
        resources:
          limits:
            memory: "150Mi"
            cpu: 0.2
          requests:
            memory: "100Mi"
            cpu: 0.1
        ...
      - name: db-sidecar
        ...

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  namespace: go-demo-5
spec:
  ...
  template:
    ...
    spec:
      containers:
      - name: api
        ...
        resources:
          limits:
            memory: 15Mi
            cpu: 0.1
          requests:
            memory: 10Mi
            cpu: 0.01
...
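
A brief note on units may help when reading the listing. Kubernetes accepts CPU quantities either as fractional cores or in millicores, so 0.1 is the same as 100m. As a sketch, the resources of the api container could equivalently be written as follows; only the notation changes, and the values are those from the listing.

```yaml
# Equivalent notation for the api container's resources (values unchanged).
resources:
  limits:
    memory: 15Mi
    cpu: 100m    # same as 0.1 of a CPU core
  requests:
    memory: 10Mi
    cpu: 10m     # same as 0.01 of a CPU core
```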

The application consists of two sets of Pods. The api Deployment is a backend API that uses the db StatefulSet for its state.
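
As a preview of where this is heading, here is a minimal sketch of a HorizontalPodAutoscaler that could target the api Deployment. It is not taken from the course repository. The name and namespace come from the definition, while the replica bounds and utilization targets are assumed values chosen only for illustration. It also assumes a cluster that serves the autoscaling/v2 API.

```yaml
# Illustrative sketch only; not part of go-demo-5-no-sidecar-mem.yml.
apiVersion: autoscaling/v2          # assumes the cluster serves autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api
  namespace: go-demo-5
spec:
  scaleTargetRef:                   # the workload whose replicas are adjusted
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 2                    # assumed value
  maxReplicas: 5                    # assumed value
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80      # assumed; percent of the Pods' CPU requests
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80      # assumed; percent of the Pods' memory requests
```

Utilization targets are expressed as a percentage of the requested resources, so an autoscaler of this kind can only act on CPU or memory if the target Pods declare requests for it.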

The essential parts of the definition are resources. Both the api and the db have requests ...