Scale up the Cluster

This lesson focuses on how to scale up the cluster and on the rules that govern that process.

Scale up the nodes #

The objective is to scale the nodes of our cluster to meet the demand of our Pods. We want not only to increase the number of worker nodes when we need additional capacity, but also to remove them when they are underused. For now, we’ll focus on the former, and explore the latter afterward.

Let’s start by taking a look at how many nodes we have in the cluster.

kubectl get nodes

The output, from GKE, is as follows.

NAME             STATUS ROLES  AGE   VERSION
gke-devops25-... Ready  <none> 5m27s v1.9.7-gke.6
gke-devops25-... Ready  <none> 5m28s v1.9.7-gke.6
gke-devops25-... Ready  <none> 5m24s v1.9.7-gke.6

In your case, the number of nodes might differ. That’s not important. What matters is to remember how many you have right now since that number will change soon.
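
If you prefer a number to eyeballing the list, the optional one-liner below counts the nodes for you. It is merely a convenience and not part of the lesson's flow.

kubectl get nodes --no-headers | wc -l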

Let’s take a look at the definition of the go-demo-5 application before we roll it out.

cat scaling/go-demo-5-many.yml

The output, limited to the relevant parts, is as follows.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  namespace: go-demo-5
spec:
  ...
  template:
    ...
    spec:
      containers:
      - name: api
        ...
        resources:
          limits:
            memory: 1Gi
            cpu: 0.1
          requests:
            memory: 500Mi
            cpu: 0.01
...
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: api
  namespace: go-demo-5
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 15
  maxReplicas: 30
  ...
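
A side note on the API version: autoscaling/v2beta1 was removed in Kubernetes 1.25, so on a recent cluster the same HPA would have to be written against the autoscaling/v2 API. The sketch below is an assumed equivalent, not part of the course repository; it keeps the same target and replica range and expresses the two 80% targets (visible in the HPA output further down) as CPU and memory resource metrics.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api
  namespace: go-demo-5
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 15
  maxReplicas: 30
  metrics:
  # Assumed equivalent of the elided metrics: CPU and memory at 80% utilization,
  # matching the two 80% targets shown in the HPA output below.
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80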

In this context, the only important part of the definition we are about to apply is the HPA connected to the api Deployment. Its minimum number of replicas is 15. Given that each api container requests 500Mi of memory, fifteen replicas (roughly 7.3Gi in total) should be more than our cluster can sustain, assuming that it was created using one of the Gists. Otherwise, you might need to increase the minimum number of replicas.
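
If you want to double-check that arithmetic on your own cluster, the optional command below lists how much memory each node can allocate. Summing those values and comparing the total with the roughly 7.3Gi requested by fifteen api replicas shows whether the Pods can all fit; you can get the same information, along with the already allocated requests, from the Allocatable and Allocated resources sections of kubectl describe nodes.

kubectl get nodes -o jsonpath=\
'{range .items[*]}{.metadata.name}{"\t"}{.status.allocatable.memory}{"\n"}{end}'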

Let’s apply the definition and take a look at the HPAs.

kubectl apply \
    -f scaling/go-demo-5-many.yml \
    --record

kubectl -n go-demo-5 get hpa

The output of the latter command is as follows.

NAME   REFERENCE        TARGETS                        MINPODS   MAXPODS   REPLICAS   AGE
api    Deployment/api   <unknown>/80%, <unknown>/80%   15        30        1          38s
db     StatefulSet/db   <unknown>/80%, <unknown>/80%   3         5         1          40s

Not enough resources to host all pods #

It doesn’t matter that the targets are still unknown. They will be calculated soon, but we do not care about them right now. What matters is that the api HPA will scale the Deployment to at least 15 replicas.
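
If the targets stay at <unknown> for more than a minute or two, it is worth describing the HPA; its Events section typically reveals whether the metrics pipeline (for example, the Metrics Server) is still gathering data or cannot be reached at all.

kubectl -n go-demo-5 describe hpa api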

Next, we need to wait for a few seconds before we take a look at the Pods in the go-demo-5 Namespace.

kubectl -n go-demo-5 get pods

The output is as follows.

NAME    READY STATUS            RESTARTS AGE
api-... 0/1   ContainerCreating 0        2s
api-... 0/1   Pending           0        2s
api-... 0/1   Pending           0        2s
api-... 0/1   ContainerCreating 0        2s
api-... 0/1   ContainerCreating 0        2s
api-... 0/1   ContainerCreating 1        32s
api-... 0/1   Pending           0        2s
api-... 0/1   ContainerCreating 0        2s
api-... 0/1   ContainerCreating 0        2s
api-... 0/1   ContainerCreating 0        2s
api-... 0/1   ContainerCreating 0        2s
api-... 0/1   ContainerCreating 0        2s
api-... 0/1   Pending           0        2s
api-... 0/1   ContainerCreating 0        2s
api-... 0/1   ContainerCreating 0        2s
db-0    2/2   Running           0        34s
db-1    0/2   ContainerCreating 0        34s

We can see that some of the api Pods are being created, while others are pending. There can be quite a few reasons why a Pod would enter the Pending state. In our case, there are not enough available resources to host all the Pods.
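
To confirm that insufficient capacity (rather than, say, an unsatisfiable node selector) is the cause, we can describe one of the Pending Pods and look for a FailedScheduling event mentioning insufficient memory or CPU. The command below is an optional convenience that picks the first Pending Pod automatically; you can just as well describe any Pending Pod by name.

kubectl -n go-demo-5 describe pod \
    $(kubectl -n go-demo-5 get pods \
    --no-headers \
    | grep Pending \
    | head -n 1 \
    | awk '{print $1}')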
