Introduction

Explore strategies for designing resilient software that endures real-world failures, including immediate crashes, gradual resource leaks, and spectacular outages. Understand techniques such as idempotent workflows, incremental backoffs, and pacing limiters. Gain insight into managing infrastructure tooling to maintain system stability during unexpected chaos.

Writing software that works in perfect conditions is easy. It would be nice if we never had to worry about network latency, service timeouts, storage outages, misbehaving applications, users sending bad arguments, security issues, or any of the other real-life scenarios we run into.

Things tend to fail in the following three ways:

  • Immediately

  • Gradually

  • Spectacularly

An immediate failure is usually the result of a change to application code that causes a service to die on startup or ...