
Break Our Application Like a Server (Part I)

Explore how to simulate server failures such as process crashes and database downtime in a real-time Phoenix and Elixir application. Learn to perform acceptance tests that validate your app’s resilience by manually killing processes and observing behavior during outages, enhancing your ability to build fault-tolerant systems.


Not every error comes from a user-initiated action: processes and tools can also fail on the server. Our application may experience network disconnections between servers, database slowness or downtime, and processes that crash because of bugs or an excessive workload. It’s nearly impossible to anticipate everything that can go wrong in an application, so we often won’t realize that our failure handling has a problem until it’s too late. Fortunately, we can simulate many of these issues locally and in staging environments before we experience them in production.

In this lesson, we’ll test what happens to our application during database downtime and when different processes crash on the server. We’ll use the observer tool that ships with Erlang/OTP to view our application’s supervision tree, and then kill various processes to ensure that our application doesn’t reach an incorrect state. A good rule is that any custom GenServer, any custom Supervisor, and our Ecto Repo can be killed without crashing the application. We’ll perform manual acceptance tests throughout this section. However, our tests will do things beyond what a typical user could do. ...
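To make the idea of killing a process concrete, here is a minimal sketch of the same check done in code rather than through the observer UI. It assumes a hypothetical worker, `Demo.Counter`, and a hypothetical helper, `Demo.KillCheck`; neither is part of the lesson's application. The worker runs under a `:one_for_one` supervisor, is killed with `Process.exit/2`, and the script then confirms that the supervisor started a replacement.

```elixir
# To inspect a running application's supervision tree by hand, run
# `:observer.start()` in an IEx session (requires Erlang built with wx support).

# Hypothetical worker used only for illustration.
defmodule Demo.Counter do
  use GenServer

  def start_link(_opts), do: GenServer.start_link(__MODULE__, 0, name: __MODULE__)

  @impl true
  def init(count), do: {:ok, count}
end

# Hypothetical helper: kills a named process and waits for its replacement.
defmodule Demo.KillCheck do
  def kill_and_wait(name, timeout \\ 1_000) do
    old_pid = Process.whereis(name)
    ref = Process.monitor(old_pid)

    # :kill cannot be trapped, so this simulates an abrupt crash.
    Process.exit(old_pid, :kill)

    receive do
      {:DOWN, ^ref, :process, ^old_pid, _reason} -> :ok
    after
      timeout -> raise "process did not exit"
    end

    {old_pid, wait_for_restart(name, old_pid, timeout)}
  end

  # Polls the registry until a different pid is registered under the name.
  defp wait_for_restart(name, old_pid, remaining) when remaining > 0 do
    case Process.whereis(name) do
      pid when is_pid(pid) and pid != old_pid ->
        pid

      _ ->
        Process.sleep(10)
        wait_for_restart(name, old_pid, remaining - 10)
    end
  end

  defp wait_for_restart(_name, _old_pid, _remaining) do
    raise "process was not restarted"
  end
end

{:ok, _sup} = Supervisor.start_link([Demo.Counter], strategy: :one_for_one)
{old_pid, new_pid} = Demo.KillCheck.kill_and_wait(Demo.Counter)
IO.inspect({old_pid, new_pid}, label: "pid before/after kill")
```

If the two pids differ and the new one is alive, the supervisor recovered the worker. Note that the replacement starts from its initial state, so any state that must survive a crash has to live somewhere else.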