
Making Nodes Drainable

Explore how to scale Istio deployments and modify HorizontalPodAutoscalers so that multiple replicas run on different nodes. Understand how scaling a Kubernetes cluster and making nodes drainable through practical chaos engineering techniques allows safer node draining and improves cluster resilience.

Taking a look at Istio Deployments

Let’s take a look at Istio Deployments.

Shell
kubectl --namespace istio-system get deployments

The output is as follows.

NAME                 READY UP-TO-DATE AVAILABLE AGE
istio-ingressgateway 1/1   1          1         12m
istiod               1/1   1          1         13m
prometheus           1/1   1          1         12m

We can see that there are two components, not counting prometheus. If we focus on the READY column, we can see that each of them has only one replica.

The two Istio components each have an associated HorizontalPodAutoscaler (HPA). HPAs control how many replicas we'll have, based on metrics like CPU and memory usage. What we need to do is set the minimum number of replicas to 2.
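
If you'd like to confirm that, listing the HPAs in the istio-system namespace should show one for each of the two components. The exact names depend on how Istio was installed, so treat this as an assumption about a default profile.

Shell
# List the HorizontalPodAutoscalers managing the Istio components
kubectl --namespace istio-system get hpa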

Since the experiment revealed that istio-ingressgateway should have at least two replicas, that’s the one we’ll focus on. Later on, the experiment might reveal other issues. If it does, we’ll deal with them then.
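
One way to raise the minimum, sketched here as an illustration rather than the exact steps we'll follow, is to patch the HPA's minReplicas field directly. The HPA name istio-ingressgateway is assumed to match the Deployment name.

Shell
# Set the minimum number of replicas to 2 (illustrative; the HPA name is assumed)
kubectl --namespace istio-system patch hpa istio-ingressgateway \
    --patch '{"spec": {"minReplicas": 2}}'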

Scaling the cluster

Before we dive into scaling Istio, we are going to explore scaling the cluster itself. It would be pointless to increase the number of replicas of Istio components as a way to solve the problem of not being able to drain a node if that is the only node in the cluster. We need the Gateway not only scaled but also distributed across different nodes of the cluster. Only then can we hope to drain a node successfully while the Gateway is running on it. We'll assume that the experiment might shut down one replica while others are still running somewhere else. Fortunately for us, Kubernetes always does its best to distribute instances of our apps across different nodes. As long as it can, it will ...