Facebook, WhatsApp, Instagram, Oculus Outage

Learn the causes of a major Facebook outage and how to avoid them.

In October 2021, Facebook experienced a global outage lasting about six hours, which also took down its affiliated services, including Messenger, WhatsApp, Mapillary, Instagram, and Oculus. Popular media reported the impact of this failure prominently. The New York Times ran the following headline: “Gone in Minutes, Out for Hours: Outage Shakes Facebook.”

According to one estimate, this outage cost Facebook about $100 million in lost revenue and many billions of dollars in market value as the company’s stock price declined.

Let’s look at the sequence of events that caused this global problem.

The sequence of events

The following sequence of events led to the outage of Facebook and its accompanying services:

  • A routine maintenance system was run to determine the spare capacity on Facebook’s backbone network.
  • Due to a configuration error, the maintenance system disconnected all the data centers from each other on the backbone network. An automated configuration review tool had checked the configuration beforehand, but tools like these aren’t perfect. In this specific case, the review tool missed the problem present in the configuration.
  • Facebook’s authoritative Domain Name System (DNS) servers had a health check rule: if a server couldn’t reach Facebook’s internal data centers, it would withdraw its network routes and thereby stop replying to client DNS queries.
  • Once the network routes to Facebook’s authoritative DNS servers were withdrawn, the cached mappings of human-readable names to IP addresses soon expired at all public DNS resolvers. When a client resolves www.facebook.com, the DNS resolver first contacts one of the root DNS servers, which returns a list of authoritative DNS servers for .com. The resolver then contacts one of those servers, which returns the IP addresses of Facebook’s authoritative DNS servers. After the route withdrawal, however, those servers were unreachable.
  • Then, no one was able to reach Facebook and its subsidiaries.
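
The interplay between the health check rule, route withdrawal, and cache expiry can be illustrated with a small simulation. This is only a toy sketch under stated assumptions, not Facebook’s actual implementation: the class names AuthoritativeDNS and Resolver, the 300-second time to live (TTL), and the documentation-only address 192.0.2.10 are all hypothetical, and the root and .com lookup steps are collapsed into a single query for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class AuthoritativeDNS:
    """Toy authoritative DNS server (hypothetical model)."""
    records: dict
    route_advertised: bool = True

    def health_check(self, backbone_up: bool) -> None:
        # Rule described in the text: if the internal data centers are
        # unreachable, withdraw the route so no queries arrive.
        self.route_advertised = backbone_up

    def query(self, name: str) -> str:
        if not self.route_advertised:
            raise ConnectionError("no route to authoritative DNS server")
        return self.records[name]


@dataclass
class Resolver:
    """Toy public resolver with a TTL-based cache (TTL value is assumed)."""
    authoritative: AuthoritativeDNS
    ttl: int = 300
    cache: dict = field(default_factory=dict)  # name -> (ip, expiry time)

    def resolve(self, name: str, now: int):
        cached = self.cache.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]  # answer served from the cache
        try:
            ip = self.authoritative.query(name)
        except ConnectionError:
            return None  # a real resolver would report a failure to the client
        self.cache[name] = (ip, now + self.ttl)
        return ip


dns = AuthoritativeDNS({"www.facebook.com": "192.0.2.10"})
resolver = Resolver(dns, ttl=300)

before = resolver.resolve("www.facebook.com", now=0)           # fetched and cached
dns.health_check(backbone_up=False)                             # backbone disconnected
while_cached = resolver.resolve("www.facebook.com", now=100)   # cache still valid
after_expiry = resolver.resolve("www.facebook.com", now=301)   # cache expired

print(before, while_cached, after_expiry)
```

In this sketch, the name keeps resolving for a short while after the route is withdrawn because the resolver still holds a cached answer. Once the cached entry expires, the resolver has no way to refresh it, which mirrors why the outage became visible to everyone shortly after the routes disappeared.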

The slides below depict the events in pictorial form.

