The DevOps Toolkit: Kubernetes Chaos Engineering

Preparing for Termination of Nodes

In this lesson, we will set up a new ConfigMap and compare it with the previous one, create a Namespace, and explore a CronJob that we will use later for the experiment.

We'll cover the following...

- What can we do?
- Inspecting the ConfigMap defined in `experiments-node.yaml`
- Applying the new ConfigMap
- Inspecting the CronJob defined in `periodic-node.yaml`

What can we do?

Now, we know how to affect not only individual applications, but also random ones running in a Namespace, or even in the whole cluster. Next, we’ll explore how to randomize our experiments on the node level as well.

In the past, we terminated or disrupted nodes where a specific application was running. Next, we will figure out how to destroy a completely random node, without any particular criteria. We’ll just do random stuff and see how it affects our cluster. If we’re lucky, such actions will not have any adverse effects. Or, maybe they will. We’ll soon find out.
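To make "a completely random node" concrete, here is a minimal Python sketch of the selection step only. It is not how the experiment in this lesson is implemented; the function name `pick_random_node` and the node names are hypothetical, and nothing here talks to a real cluster or terminates anything.

```python
import random
from typing import Optional, Sequence


def pick_random_node(node_names: Sequence[str],
                     rng: Optional[random.Random] = None) -> str:
    """Return one node name chosen uniformly at random, with no other criteria."""
    if not node_names:
        raise ValueError("the cluster reported no nodes to choose from")
    rng = rng or random.Random()
    return rng.choice(list(node_names))


# Hypothetical node names; in a real cluster they could be listed
# with `kubectl get nodes`.
nodes = ["node-a", "node-b", "node-c"]
victim = pick_random_node(nodes)
print(f"Node selected for termination: {victim}")
```

Because no label or application is involved in the choice, any workload in the cluster may end up on the selected node, which is exactly why the effect is hard to predict up front.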

We couldn’t do this before because the steady-state hypothesis of our experiments was not enough, but we can do it now. If we destroy something (almost) completely random, any part of the system can be affected. We cannot realistically use the Chaos Toolkit hypothesis to predict what the initial state should be, nor what the state should be after destructive cluster-wide actions. Technically, we could, but it would be too complicated, and we would be trying to solve the problem with the wrong tool.

Now, we know that we can use Prometheus to store metrics and that we can monitor our system through dashboards like Grafana and Kiali. We could, and should, go further. For example, we should create alerts that will notify us when any part of the system is misbehaving.
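As a rough illustration of what such an alert boils down to, the following Python sketch scans a query result for series that exceed a threshold. The payload mimics the JSON shape returned by the Prometheus HTTP API for an instant query, but the sample data, the `app` label, the 5% threshold, and the function name `misbehaving_series` are all assumptions made for this example. In practice, alerts would more likely be defined as rules in Prometheus itself rather than in ad hoc scripts.

```python
import json
from typing import Dict, List

# Hypothetical response body, shaped like the result of an instant query
# against the Prometheus HTTP API (/api/v1/query).
SAMPLE_RESPONSE = """
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {"metric": {"app": "frontend"}, "value": [1700000000, "0.002"]},
      {"metric": {"app": "backend"}, "value": [1700000000, "0.150"]}
    ]
  }
}
"""


def misbehaving_series(response_body: str, threshold: float) -> List[Dict[str, str]]:
    """Return the label sets of all series whose sample value exceeds the threshold."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise RuntimeError("Prometheus query did not succeed")
    offenders = []
    for series in payload["data"]["result"]:
        # Each sample is a [timestamp, value] pair; the value is a string.
        _timestamp, raw_value = series["value"]
        if float(raw_value) > threshold:
            offenders.append(series["metric"])
    return offenders


# Assumed threshold: flag anything where more than 5% of requests fail.
alerts = misbehaving_series(SAMPLE_RESPONSE, threshold=0.05)
print(alerts)
```

The point of the sketch is that the check is independent of which node was destroyed: it watches the system as a whole, which is what a random, cluster-wide experiment needs.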

Now, we are ready to go full throttle and run our experiments on the cluster level.

Inspecting the ConfigMap defined in `experiments-node.yaml`

Let’s take a look ...
