Chaos Experiments Checklist

This lesson presents the checklist of chaos experiments we’ll follow in this course.

What we are going to do

Before we dive into practical examples, we’ll define a checklist of things we want to accomplish. What could be our goals?

To begin, we probably want to see what happens when we terminate an instance of an application. We might also want to see what happens when we partially terminate the network, or when we increase its latency by delaying requests. We might want to simulate denial of service attacks, and we might want to drain or even delete a node.

We’re going to focus on instances of our applications, networking, and nodes. To make the results of those experiments visible, we may need to create reports and send notifications to everyone interested. Finally, we almost certainly want to target and run our experiments inside a Kubernetes cluster. The summary of the tasks we want to accomplish is as follows:

  • Terminate an instance of an app
  • Partially terminate a network
  • Increase latency
  • Simulate Denial of Service (DoS) attacks
  • Drain a node
  • Delete a node
  • Create reports
  • Send notifications
  • Run the experiments inside a Kubernetes cluster

The things we will not do

It might be just as important to define what we will NOT do. We will not go through how to modify the internals of an application, and we will not touch the architecture of our applications. We are going to assume that applications are as they are. This is not because we shouldn’t be modifying the internals of our apps. We definitely should be adapting the code and the architecture of applications based on the results of our experiments.

That’s not the subject of this course, though, simply because I would need to guess which programming language you are using and provide examples in Go, Java, Node.js, Python, etc. All in all, we will not be modifying applications.

Also, we do not intend to permanently change the definition of anything. Whatever we do, we will try (when possible) to undo the effects of our experiments. If we damage a network, we will try to undo the changes that caused the damage. We might not always succeed, though. Sometimes, it might not be possible to roll back the results of an experiment. We’ll do our best, but we are yet to see whether we’ll succeed.
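
To make the "always try to undo" idea concrete, here is a minimal sketch in Python. It is not the tooling used in this course; the function `run_experiment`, its parameters, and the in-memory `network` dictionary are all hypothetical stand-ins for a real fault and a real system. It only illustrates the pattern: apply a fault, observe, and attempt a rollback no matter what happened, recording the outcome if the rollback itself fails.

```python
def run_experiment(name, action, observe, rollback):
    """Apply a fault, observe the system, then always attempt a rollback.

    All four arguments are hypothetical: `action`, `observe`, and
    `rollback` are plain callables supplied by the caller.
    Returns a list of log messages describing what happened.
    """
    log = []
    action()
    log.append(f"{name}: fault applied")
    try:
        observe()
        log.append(f"{name}: observation finished")
    except Exception as err:
        # A failed observation must not prevent the rollback below.
        log.append(f"{name}: observation failed ({err})")
    finally:
        try:
            rollback()
            log.append(f"{name}: rolled back")
        except Exception as err:
            # We do our best, but a rollback is not guaranteed to succeed.
            log.append(f"{name}: rollback failed ({err})")
    return log


# Illustrative in-memory "network" standing in for a real system.
network = {"latency_ms": 0}

def add_latency():
    network["latency_ms"] = 500

def remove_latency():
    network["latency_ms"] = 0

log = run_experiment("add-latency", add_latency, lambda: None, remove_latency)
for line in log:
    print(line)
```

The rollback sits in a `finally` block so that it runs even when the observation step raises an error, and a failed rollback is reported rather than hidden.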

Next, we’ll explore how this course is organized.

