Draining Worker Nodes

In this lesson, we will carry out a chaos experiment that drains everything from a worker node.

We'll cover the following

- The reasoning behind this experiment
- Inspecting the definition of node-drain.yaml
- Describing the labels of nodes of the cluster
- Exporting the NODE_LABEL variable
- Running chaos experiment and inspecting the output
- Why couldn’t we drain the node?

The reasoning behind this experiment

We’re going to try to drain everything from a random worker node.

Why do you think we might want to do something like this? One possible reason is upgrades. The draining process is the same one we are likely to use when upgrading our Kubernetes cluster.

Upgrading a Kubernetes cluster usually involves a few steps. Typically, we’d drain a node, shut it down, and replace it with an upgraded version of the node. Alternatively, we might upgrade a node in place without shutting it down, but that is more appropriate for bare-metal servers that cannot be destroyed and created at will. We’d then repeat the same steps, one node after another, until the whole cluster is upgraded. The process is often called rolling updates (or rolling upgrades), and it is employed by most Kubernetes distributions.
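The drain, shut down, and replace loop described above can be sketched as a toy simulation. This is only an illustration of the order of the steps, not how any Kubernetes distribution implements upgrades. The `Node` class, the `rolling_upgrade` function, and the version strings are all hypothetical names made up for this sketch, and rescheduling is reduced to a simple round-robin.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical in-memory stand-in for a worker node.
    name: str
    version: str
    pods: List[str] = field(default_factory=list)


def rolling_upgrade(nodes: List[Node], new_version: str) -> List[Node]:
    """Upgrade nodes one at a time: drain, shut down, replace."""
    if len(nodes) < 2:
        raise ValueError("need at least two nodes to move Pods elsewhere")
    for i in range(len(nodes)):
        old = nodes[i]
        others = [n for j, n in enumerate(nodes) if j != i]
        # Drain: move every Pod to the remaining nodes (round-robin).
        for k, pod in enumerate(old.pods):
            others[k % len(others)].pods.append(pod)
        # Shut down the old node and replace it with an upgraded, empty one.
        nodes[i] = Node(name=old.name, version=new_version)
    return nodes


cluster = [
    Node("node-a", "old", ["api-1", "api-2"]),
    Node("node-b", "old", ["db-1"]),
    Node("node-c", "old"),
]
for node in rolling_upgrade(cluster, "new"):
    print(node.name, node.version, node.pods)
```

Note that only one node is out of service at any moment, which is why the approach depends on the remaining nodes having room for the displaced Pods.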

We want to make sure nothing goes wrong during or after a cluster upgrade. To do that, we’re going to design an experiment that performs the most critical step of the process. It will drain a random node, and we will validate whether our applications are just as healthy as before.

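If you’re not familiar with the expression, draining means removing everything from a node. The node is marked as unschedulable, and the Pods running on it are evicted so that they can be recreated on other nodes.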
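To make the idea concrete, here is a minimal sketch of the experiment's logic as a toy model: pick a random node, drain it, and check that the same set of Pods is still running afterwards. It is a simplification under stated assumptions. The cluster is just a dictionary of node names to Pod names, the functions `drain`, `running_pods`, and `run_experiment` are hypothetical helpers invented for this example, and a real drain also has to deal with concerns such as available capacity that are ignored here.

```python
import random
from typing import Dict, List, Tuple

Cluster = Dict[str, List[str]]  # hypothetical model: node name -> Pod names


def drain(cluster: Cluster, node: str) -> Cluster:
    """Return a new cluster state with every Pod moved off `node`."""
    targets = [n for n in cluster if n != node]
    if not targets:
        raise RuntimeError("no other node can host the evicted Pods")
    new_state = {n: list(pods) for n, pods in cluster.items()}
    evicted = new_state[node]
    new_state[node] = []  # the drained node ends up empty
    for i, pod in enumerate(evicted):
        # Toy rescheduling: spread evicted Pods round-robin over other nodes.
        new_state[targets[i % len(targets)]].append(pod)
    return new_state


def running_pods(cluster: Cluster) -> List[str]:
    """All Pods in the cluster, sorted so two states can be compared."""
    return sorted(p for pods in cluster.values() for p in pods)


def run_experiment(cluster: Cluster, rng=random) -> Tuple[str, bool]:
    """Drain a random node and report whether the Pods survived."""
    before = running_pods(cluster)
    victim = rng.choice(sorted(cluster))
    after = drain(cluster, victim)
    healthy = running_pods(after) == before and after[victim] == []
    return victim, healthy


cluster = {"node-a": ["api-1", "db-1"], "node-b": ["api-2"], "node-c": []}
victim, healthy = run_experiment(cluster)
print(f"drained {victim}, applications healthy: {healthy}")
```

The single-node case is the interesting one: `drain` raises an error because there is nowhere to move the Pods, which is the kind of weakness an experiment like this is meant to surface.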

Inspecting the definition of `node-drain.yaml`

Let’s take a look at yet another definition of an experiment.
