Disaster Recovery Strategies
Explore the four main disaster recovery strategies in AWS to design resilient systems. Understand how to select the right approach based on recovery time objectives, data loss tolerance, and cost. Learn the distinctions between high availability and disaster recovery, and how to apply AWS services for effective failover and continuity across multiple Regions.
Every enterprise architecture on AWS must answer a fundamental question: When an entire Region becomes unavailable, how quickly can the business resume operations, and how much data can it afford to lose? The AWS Certified Solutions Architect – Professional exam tests your ability to design for exactly this scenario. It distinguishes between high-availability patterns that protect against component failure within a Region and disaster recovery strategies that protect against Region-level disruption.
This lesson walks through the four canonical DR strategies, the AWS services that underpin each, and the decision framework that maps business requirements to the right architectural response.
Introduction to disaster recovery on AWS
Disaster recovery on AWS refers to the set of architectural patterns and operational procedures that restore workloads in a secondary Region after a primary Region experiences a prolonged or catastrophic failure. High availability, by contrast, uses Multi-AZ deployments within a single Region to survive the failure of an individual component or Availability Zone.
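Two metrics govern every DR design decision:
Recovery time objective (RTO): the maximum acceptable time between the start of a disruption and the restoration of service.
Recovery point objective (RPO): the maximum acceptable amount of data loss, expressed as the time elapsed since the last recoverable copy of the data.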
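As a rough illustration of how these two metrics are checked against a design, the sketch below estimates worst-case data loss and recovery time for a periodic-backup setup and compares them with the recovery point objective (RPO) and recovery time objective (RTO). The function names, the breakdown of recovery into four phases, and every number are assumptions made for this example, not AWS-defined values.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta,
                         copy_delay: timedelta = timedelta(0)) -> timedelta:
    """Upper bound on lost data: a failure just before the next backup,
    plus the time a finished backup needs to reach the recovery Region."""
    return backup_interval + copy_delay


def estimated_recovery_time(detect: timedelta, provision: timedelta,
                            restore: timedelta, cutover: timedelta) -> timedelta:
    """Sum of the (assumed) sequential phases of a recovery."""
    return detect + provision + restore + cutover


def meets_objectives(data_loss: timedelta, recovery_time: timedelta,
                     rpo: timedelta, rto: timedelta) -> bool:
    """A design qualifies only if it satisfies both objectives."""
    return data_loss <= rpo and recovery_time <= rto


# Hypothetical workload: backups every 6 hours, 30 minutes to copy cross-Region.
loss = worst_case_data_loss(timedelta(hours=6), timedelta(minutes=30))
recovery = estimated_recovery_time(
    detect=timedelta(minutes=15),
    provision=timedelta(minutes=45),
    restore=timedelta(hours=2),
    cutover=timedelta(minutes=15),
)
# Hypothetical business requirement: RPO and RTO of 4 hours each.
ok = meets_objectives(loss, recovery, rpo=timedelta(hours=4), rto=timedelta(hours=4))
```

In this made-up case the recovery time fits within the RTO, but the worst-case data loss exceeds the RPO, so the design would need more frequent backups or continuous replication.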
The four canonical strategies form a spectrum of rising cost and shrinking recovery time. At the low-cost end, Backup and Restore suits workloads that can tolerate an RTO and RPO measured in hours. Pilot Light keeps core data replicated in the recovery Region so the rest of the environment can be rebuilt rapidly. Warm Standby maintains a scaled-down but live copy of the environment. Multi-Site Active-Active delivers near-zero RTO and RPO at the highest cost. The Reliability Pillar of the AWS Well-Architected Framework describes these patterns and serves as the authoritative reference for them.
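The mapping from business requirements to a point on this spectrum can be sketched as a simple lookup: pick the cheapest strategy whose typical recovery figures satisfy both objectives. The RTO and RPO values per strategy and the relative cost ranks below are illustrative assumptions chosen for this example; real figures depend on the workload and should come from testing.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class Strategy:
    name: str
    typical_rto: timedelta   # assumed, illustrative value
    typical_rpo: timedelta   # assumed, illustrative value
    relative_cost: int       # 1 = cheapest, 4 = most expensive


# Ordered along the cost vs. recovery spectrum described in the text.
STRATEGIES = [
    Strategy("Backup and Restore", timedelta(hours=24), timedelta(hours=24), 1),
    Strategy("Pilot Light", timedelta(hours=1), timedelta(minutes=15), 2),
    Strategy("Warm Standby", timedelta(minutes=15), timedelta(minutes=5), 3),
    Strategy("Multi-Site Active-Active", timedelta(minutes=1), timedelta(seconds=5), 4),
]


def cheapest_strategy(rto: timedelta, rpo: timedelta) -> Optional[Strategy]:
    """Return the lowest-cost strategy meeting both objectives, or None."""
    candidates = [
        s for s in STRATEGIES
        if s.typical_rto <= rto and s.typical_rpo <= rpo
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda s: s.relative_cost)


# Hypothetical requirement: recover within 30 minutes, lose at most 10 minutes of data.
choice = cheapest_strategy(rto=timedelta(minutes=30), rpo=timedelta(minutes=10))
```

Selecting by cost among the qualifying strategies reflects the usual design goal: meet the stated objectives without paying for more resilience than the business requires.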
Key AWS services recur across all strategies: Amazon S3 with Cross-Region Replication and versioning, AWS Backup with cross-Region and cross-account vaulting, Route 53 health checks and failover routing, Global Accelerator, Aurora Global Database, RDS cross-Region read replicas, Auto Scaling, and infrastructure-as-code tools such as CloudFormation and Terraform. The next lesson on AWS Elastic Disaster Recovery covers a managed service that automates several of these patterns; this lesson focuses on the strategies themselves and the decision logic behind them.
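As one small illustration of the Route 53 failover routing mentioned above, the following sketch builds the change batch for a primary and a secondary failover record. It only constructs the request payload; the domain name, IP addresses, and health check ID are hypothetical placeholders, and the helper function names are inventions for this example. The resulting dictionary is intended to be passed as the ChangeBatch argument of the boto3 Route 53 client's change_resource_record_sets call, together with a HostedZoneId.

```python
from typing import Any, Dict, Optional


def failover_record(name: str, ip: str, role: str,
                    health_check_id: Optional[str] = None,
                    ttl: int = 60) -> Dict[str, Any]:
    """Build one UPSERT change for a failover-routed A record."""
    if role not in ("PRIMARY", "SECONDARY"):
        raise ValueError("role must be 'PRIMARY' or 'SECONDARY'")
    record: Dict[str, Any] = {
        "Name": name,
        "Type": "A",
        "SetIdentifier": f"{role.lower()}-endpoint",  # distinguishes the two records
        "Failover": role,
        "TTL": ttl,  # a short TTL lets clients pick up a failover sooner
        "ResourceRecords": [{"Value": ip}],
    }
    if health_check_id is not None:
        # Route 53 uses the health check to decide when to stop answering
        # with this record.
        record["HealthCheckId"] = health_check_id
    return {"Action": "UPSERT", "ResourceRecordSet": record}


def failover_change_batch(name: str, primary_ip: str, secondary_ip: str,
                          primary_health_check_id: str) -> Dict[str, Any]:
    """Change batch with a health-checked primary and a standby secondary."""
    return {
        "Comment": "Failover pair for cross-Region DR (illustrative)",
        "Changes": [
            failover_record(name, primary_ip, "PRIMARY", primary_health_check_id),
            failover_record(name, secondary_ip, "SECONDARY"),
        ],
    }


# Hypothetical values: documentation IP ranges and a placeholder health check ID.
batch = failover_change_batch(
    name="app.example.com.",
    primary_ip="192.0.2.10",
    secondary_ip="198.51.100.10",
    primary_health_check_id="hypothetical-health-check-id",
)
```

Keeping the payload construction separate from the API call makes the record definitions easy to review and test before any DNS change is applied.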
The following diagram illustrates the four strategies along the cost vs. recovery spectrum.
With this spectrum established, the next sections examine each strategy in architectural detail, beginning with the lowest-cost option. ...