Facebook, WhatsApp, Instagram, Oculus Outage

Discover how a routine configuration change led to the 2021 Facebook global outage through cascading failures and DNS withdrawal. Learn essential System Design lessons on avoiding automation pitfalls, ensuring operational readiness, and implementing simple, robust contingency plans for core services.

On October 4, 2021, Facebook suffered a six-hour global outage that also took down its related services, including Messenger, WhatsApp, Instagram, and Oculus. The New York Times described the event with the headline: “Gone in Minutes, Out for Hours: Outage Shakes Facebook.” Estimates suggest the outage cost Facebook about $100 million in revenue and billions in market value.

The sequence of events

A chain reaction of technical failures led to the total blackout:

  • Routine maintenance: An automated system attempted to assess spare capacity on Facebook’s backbone network. ...