Specify Replicas in Deployments or StatefulSets?
In this lesson, we will explore different strategies for where to define the number of replicas: in our Deployments and StatefulSets, or only in the HPA.
Knowing that HorizontalPodAutoscaler (HPA) manages auto-scaling of our applications, a question arises about replicas. Should we define them in our Deployments and StatefulSets, or should we rely solely on HPA to manage them? Instead of answering that question directly, we’ll explore different combinations and, based on the results, define the strategy.
HPA modifies the Deployment
First, let’s see how many Pods we have in our cluster right now.
kubectl -n go-demo-5 get pods
The output is as follows.
NAME    READY STATUS  RESTARTS AGE
api-... 1/1   Running 0        27m
api-... 1/1   Running 2        31m
db-0    2/2   Running 0        20m
db-1    2/2   Running 0        20m
db-2    2/2   Running 0        21m
We can see that there are two replicas of the api Deployment, and three replicas of the db StatefulSet.
Let’s say that we want to roll out a new release of our go-demo-5 application. The definition we’ll use is as follows.
cat scaling/go-demo-5-replicas-10.yml
The output, limited to the relevant parts, is as follows.
...
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  namespace: go-demo-5
spec:
  replicas: 10
...
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: api
  namespace: go-demo-5
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 80
  - type: Resource
    resource:
      name: memory
      targetAverageUtilization: 80
The important thing to note is that our api Deployment asks for 10 replicas, while the HPA that targets it allows a maximum of 5. Everything else is the same as it was before.
What will happen if we apply that definition?
kubectl apply \
-f scaling/go-demo-5-replicas-10.yml
kubectl -n go-demo-5 get pods
We applied the new definition and retrieved all the Pods from the go-demo-5 Namespace. The output of the latter command is as follows.
NAME    READY STATUS            RESTARTS AGE
api-... 1/1   Running           0        9s
api-... 0/1   ContainerCreating 0        9s
api-... 0/1   ContainerCreating 0        9s
api-... 1/1   Running           2        41m
api-... 1/1   Running           0        22s
api-... 0/1   ContainerCreating 0        9s
api-... 0/1   ContainerCreating 0        9s
api-... 1/1   Running           0        9s
api-... 1/1   Running           0        9s
api-... 1/1   Running           0        9s
db-0    2/2   Running           0        31m
db-1    2/2   Running           0        31m
db-2    2/2   Running           0        31m
Kubernetes complied with our desire to have ten replicas of the api and created eight Pods (we had two before). At first glance, it seems that HPA does not have any effect. Let’s retrieve the Pods one more time.
kubectl -n go-demo-5 get pods
The output is as follows.
NAME    READY STATUS  RESTARTS AGE
api-... 1/1   Running 0        30s
api-... 1/1   Running 2        42m
api-... 1/1   Running 0        43s
api-... 1/1   Running 0        30s
api-... 1/1   Running 0        30s
db-0    2/2   Running 0        31m
db-1    2/2   Running 0        32m
db-2    2/2   Running 0        32m
Our Deployment scaled down from ten to five replicas. HPA detected that there were more replicas than its maximum threshold and acted accordingly. But what did it do? Did it simply remove five replicas? That could not be the case, since it would have only a temporary effect. If HPA removed or added Pods directly, the Deployment would counter by adding or removing Pods to get back to its own replica count, and the two would be fighting each other. The number of Pods would fluctuate indefinitely. Instead, HPA modified the Deployment.
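The bounding step we just observed can be sketched in a few lines of shell. This is only an illustration: clamp_replicas is a hypothetical helper, not part of kubectl or Kubernetes, and it ignores metrics entirely, whereas the real HPA also weighs CPU and memory usage before settling on a number.

```shell
# Hypothetical helper (not a Kubernetes command): bound a replica count
# by the HPA's minReplicas and maxReplicas, ignoring metrics.
clamp_replicas() {
  # $1 = replicas requested by the Deployment
  # $2 = minReplicas, $3 = maxReplicas
  if [ "$1" -lt "$2" ]; then
    echo "$2"
  elif [ "$1" -gt "$3" ]; then
    echo "$3"
  else
    echo "$1"
  fi
}

# Values from the definition we applied:
# replicas: 10, minReplicas: 2, maxReplicas: 5
clamp_replicas 10 2 5
```

With the values from our definition, the helper prints 5, which matches the number of api Pods we ended up with.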
Let’s describe the api Deployment.
kubectl -n go-demo-5 \
describe deployment api
The output, limited to the relevant parts, is as follows.
...
Replicas: 5 desired | 5 updated | 5 total | 5 available | 0 unavailable
...
Events:
... Message
... -------
...
... Scaled up replica set api-5bbfd85577 to 10
... Scaled down replica set api-5bbfd85577 to 5
The desired number of replicas is set to 5. HPA modified our Deployment. We can observe that better through the event messages. The second to last states that the number of replicas was scaled up to 10, while the last message indicates that it was scaled down to 5. The former is the result of us executing a rolling update by applying the new Deployment, while the latter was produced by HPA modifying the Deployment by changing its number of replicas.
So far, we observed that HPA modifies our Deployments. No matter how many replicas we define in a Deployment (or a StatefulSet), HPA will change that number to fit its own thresholds and calculations. In other words, when we update a Deployment, the number of replicas is temporarily changed to whatever we defined, only to be modified again by HPA a few moments later. That behavior is unacceptable.
If HPA changed the number of replicas, there is usually a good reason for that. Resetting that number to whatever is set in a Deployment (or a StatefulSet) can produce serious side-effects.
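To get a feel for one such side-effect, consider the opposite of what we just tried: a definition with fewer replicas than HPA is currently running. The arithmetic below is a rough sketch with made-up numbers (they do not come from the cluster above), and it assumes every replica carries an equal share of the load.

```shell
# All numbers are hypothetical and only illustrate the reasoning.
hpa_replicas=5        # replicas HPA settled on while under load
manifest_replicas=2   # replicas hard-coded in the Deployment definition

# Share of the previous capacity left right after `kubectl apply`,
# assuming each replica handles an equal part of the traffic.
remaining_pct=$(( manifest_replicas * 100 / hpa_replicas ))
echo "Capacity right after apply: ${remaining_pct}% of what HPA had provisioned"
```

Under those assumptions, the application would run at 40% of the capacity HPA had provisioned until HPA reacts and scales it back up.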