Destroying Cluster Zones

This lesson contains your next self-assignment and discusses some critical things that we have not done yet.


As I mentioned before, nothing happens if we destroy only one out of three nodes (again, ignoring the DB). But, if we continue destroying nodes, bad things could happen.

Here comes yet another assignment: Figure out what would happen if we destroy a whole data center.

Critical issues with our clusters

There are at least two critical things that we didn’t do in the previous section, but should have.

Not using Cluster Autoscaler

First of all, we should have created our Kubernetes cluster with Cluster Autoscaler so that it automatically scales up and down depending on the workload. It adds nodes when Pods cannot be scheduled because there is not enough capacity, and it removes nodes that are no longer needed. That also helps when a node goes down: if the remaining nodes cannot host the displaced Pods, the cluster figures out that there is not enough capacity, and a new node is added. Cluster Autoscaler alone would solve, fully or partly, the problems we could have encountered if we had continued the previous experiment and kept deleting nodes.
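To make the capacity reasoning concrete, here is a toy model in Python. It is only a sketch under loud assumptions: it compares total CPU requests with total node capacity, while the real Cluster Autoscaler reacts to Pods that cannot be scheduled and takes many more constraints into account. The function names, the Pod requests, and the node size are all hypothetical.

```python
def nodes_needed(pod_requests_m, node_capacity_m):
    """Smallest number of equal-sized nodes whose total CPU covers all requests.

    Values are in millicores. This ignores bin-packing, memory, and
    scheduling constraints, so it is an optimistic lower bound.
    """
    total = sum(pod_requests_m)
    # Integer ceiling division avoids floating-point rounding.
    return (total + node_capacity_m - 1) // node_capacity_m


def nodes_to_add(pod_requests_m, node_capacity_m, current_nodes, max_nodes):
    """How many nodes a (toy) autoscaler would add, capped at max_nodes."""
    target = min(nodes_needed(pod_requests_m, node_capacity_m), max_nodes)
    return max(0, target - current_nodes)


# Hypothetical workload: six Pods requesting 500m CPU each,
# running on nodes with 1000m of allocatable CPU.
pods = [500] * 6

healthy = nodes_to_add(pods, 1000, current_nodes=3, max_nodes=3)
one_node_destroyed = nodes_to_add(pods, 1000, current_nodes=2, max_nodes=3)
two_nodes_destroyed = nodes_to_add(pods, 1000, current_nodes=1, max_nodes=3)
```

With three healthy nodes, the model adds nothing. After each destroyed node, it asks for a replacement. Without autoscaling, that missing capacity would simply stay missing.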

Running a zonal cluster

The second issue is that we are running a zonal cluster. If you followed my Gists, your cluster is running in a single zone, which means that it is not fault-tolerant. If that zone (data center) goes down, we’d be lost. So, the second change we should have made to our cluster is to make it regional. It should span multiple zones within the same region. It shouldn’t span different regions because that would increase latency unnecessarily. Every cloud provider, at least the big three, has a concept of a region, even though it might be named differently. By region, I mean a group of zones (data centers) that are close enough to each other to keep latency low, while still operating as entirely separate entities. Failure of one should not affect the others. At least, that’s how it should be in theory.
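The difference between a zonal and a regional cluster can be sketched with a few lines of Python. This is a simulation over made-up data, not a query against a real cluster: the node names and zone names are hypothetical, and the only real identifier is the well-known node label topology.kubernetes.io/zone, which Kubernetes uses to record the zone of a node.

```python
from collections import Counter

# Well-known Kubernetes node label that records the zone of a node.
ZONE_LABEL = "topology.kubernetes.io/zone"


def zone_spread(nodes):
    """Count nodes per zone from a list of {"name": ..., "labels": {...}} dicts."""
    return Counter(node["labels"].get(ZONE_LABEL, "unknown") for node in nodes)


def survivors_after_zone_failure(nodes, failed_zone):
    """Names of the nodes that remain if every node in failed_zone is lost."""
    return [
        node["name"]
        for node in nodes
        if node["labels"].get(ZONE_LABEL) != failed_zone
    ]


# Hypothetical inventories; names and zones are made up for illustration.
zonal_cluster = [
    {"name": "node-1", "labels": {ZONE_LABEL: "zone-a"}},
    {"name": "node-2", "labels": {ZONE_LABEL: "zone-a"}},
    {"name": "node-3", "labels": {ZONE_LABEL: "zone-a"}},
]
regional_cluster = [
    {"name": "node-1", "labels": {ZONE_LABEL: "zone-a"}},
    {"name": "node-2", "labels": {ZONE_LABEL: "zone-b"}},
    {"name": "node-3", "labels": {ZONE_LABEL: "zone-c"}},
]

zonal_survivors = survivors_after_zone_failure(zonal_cluster, "zone-a")
regional_survivors = survivors_after_zone_failure(regional_cluster, "zone-a")
```

In this sketch, losing zone-a leaves the zonal cluster with no nodes at all, while the regional cluster keeps two out of three.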

Making a regional cluster

Therefore, we should make our cluster regional, and we should make it scalable. Since the steps differ from one provider to another, I will not show you how to create such a cluster, but I will provide the Gists. Please consult them to see how to create a better Kubernetes cluster in Google, Azure, or AWS.

We’ll start the next section by creating the cluster in that way. It will scale automatically, and it will be regional.

That’s about it. There are many other things we could do with nodes, but we explored just enough for you to have at least a basic understanding of how chaos engineering works when applied to Kubernetes nodes.

In the next lesson, we will remove the resources that we have created.
