Dogpile

Explore how dogpile effects occur in distributed systems during startup, cron jobs, or synchronized events that cause sudden demand surges. Understand the risks of cascading failures and how techniques like randomizing task timings and increasing backoff intervals help maintain system stability and resilience in real-world conditions.
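As a rough illustration of the two techniques named above, the sketch below shows one common way to randomize task timing and to grow retry intervals. It is a minimal example under stated assumptions, not a prescribed implementation: the function names `backoff_delay` and `jittered_start` and the parameters `base`, `cap`, and `spread` are hypothetical, and the "full jitter" strategy (a uniform random delay below an exponentially growing ceiling) is just one of several reasonable choices.

```python
import random


def backoff_delay(attempt, base=1.0, cap=60.0, rng=random):
    """Return a randomized retry delay in seconds for a 0-based attempt number.

    Hypothetical helper: the ceiling doubles with each attempt and is
    limited to `cap`; the actual delay is drawn uniformly below it.
    """
    # Exponential ceiling: base, 2*base, 4*base, ... limited to `cap`.
    ceiling = min(cap, base * (2 ** attempt))
    # Drawing uniformly in [0, ceiling] keeps clients that failed at the
    # same moment from all retrying at the same moment.
    return rng.uniform(0.0, ceiling)


def jittered_start(scheduled, spread, rng=random):
    """Shift a scheduled start time by a random offset in [0, spread] seconds.

    Hypothetical helper for cron-style jobs: each host adds its own offset,
    so a fleet does not fire the same task in the same instant.
    """
    return scheduled + rng.uniform(0.0, spread)


if __name__ == "__main__":
    rng = random.Random()  # assumption: each host uses its own generator
    for attempt in range(5):
        print(f"attempt {attempt}: wait {backoff_delay(attempt, rng=rng):.2f}s")
    print(f"job starts {jittered_start(0.0, 300.0, rng=rng):.1f}s after the hour")
```

The point of the randomness in both helpers is the same: servers or clients that would otherwise act in lockstep are spread across a window, so demand arrives as a ramp instead of a spike.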

Power outage under high stress

A large-scale power outage acts a lot like a software failure. It starts with a small event, like a power line grounding out on a tree. Ordinarily that would be no big deal, but under high-stress conditions it can turn into a cascading failure that affects millions of people. We can learn from how power is restored after an outage. Operators must perform a tricky balancing act between generation, transmission, and demand.

It used to be common for power to be restored and then cut off again within seconds. The surge in demand for current from millions of air conditioners and refrigerators would overload the newly restored supply. This was especially common in large metro areas during heat waves.

The increased current load would hit just when supply was low, causing excess demand to trip circuit breakers. Lights out, again. Smarter appliances and more modern control systems have ...