AWS Global Infrastructure for Architects
Explore AWS global infrastructure layers to understand how to architect resilient, low-latency, and compliant cloud systems. Learn to make foundational decisions about Region and Availability Zone selection, multi-Region and multi-AZ patterns, hybrid connectivity, and governance to optimize workload placement and disaster recovery.
Every enterprise-grade AWS deployment begins with a single, foundational decision: where to place workloads across the planet. For the AWS Solutions Architect exam, this decision is not about picking a Region from a dropdown. It is about evaluating failure domains, latency boundaries, compliance constraints, and cost trade-offs across a layered global infrastructure. Architects who master these layers can design systems that survive Availability Zone outages, recover from regional disasters, and serve users worldwide with minimal latency.
This lesson breaks down each layer and the architectural patterns that connect them.
Understanding AWS global infrastructure
AWS global infrastructure is designed as a layered hierarchy of isolation boundaries, where each layer plays a specific role in improving resilience, availability, and performance.
At the highest level are Regions. These are geographically separated clusters of data centers, each fully independent and with its own power, cooling, and networking. This isolation allows you to design systems that remain operational even if an entire Region is affected.
Within each Region are Availability Zones (AZs). Each AZ consists of one or more physically separate data centers, and the AZs in a Region are connected by low-latency, high-throughput networking. From a networking perspective, every Amazon VPC subnet resides in exactly one AZ, which allows architects to logically group resources and isolate failures. Because AZs are isolated from one another, applications can be distributed across multiple AZs to achieve high availability and fault tolerance.
Extending beyond Regions and AZs are edge locations, which form the outermost layer of AWS infrastructure. With hundreds of points of presence worldwide, these locations power services like Amazon CloudFront for content caching and AWS Global Accelerator for optimized network routing. They bring content and network entry points closer to users, reducing latency.
AWS also offers Local Zones, which place compute, storage, and database services closer to large population centers. These are designed for latency-sensitive applications that require single-digit-millisecond response times, such as real-time gaming or media production. Similarly, AWS Wavelength Zones embed AWS compute and storage services within the data centers of 5G telecommunication providers, delivering ultra-low-latency applications directly to mobile edge devices and end users.
The postal system analogy
A helpful way to understand this structure is through a postal system analogy. Regions are like independent countries, each with its own postal network. Availability Zones are sorting facilities within a country. If one facility fails, others continue operating. Edge locations act like neighborhood mailboxes, bringing services closer to users. Local Zones are satellite offices in densely populated areas, enabling faster, near-instant delivery.
Currently, AWS infrastructure spans 39 Regions, 123 Availability Zones, 750+ edge locations, 43 Local Zones, and 33 Wavelength Zones.
Choosing the right combination of these layers is the first major architectural decision. Every other choice an architect makes, including networking design, data replication strategy, and governance model, builds on this foundation.
The following diagram illustrates how these layers relate to each other and how user traffic traverses them.
With this layered model established, the next step is understanding how architects evaluate and select specific Regions and AZs for workload placement.
Region and AZ design decisions
Selecting a Region is a multivariable optimization problem. Four primary factors drive the decision, and architects must weigh them against each other rather than optimizing for any single dimension.
Data residency and compliance dictate Region selection when regulations such as GDPR require data to remain within specific geographic boundaries, making EU-based Regions mandatory regardless of latency or cost. GDPR (General Data Protection Regulation) is a European Union law that governs how personal data of individuals in the EU is collected, stored, processed, and protected.
Proximity to end users reduces round-trip latency, which directly impacts user experience for interactive applications and API-driven workloads.
Service availability varies across Regions because AWS does not launch every service simultaneously in all Regions. Architects must verify that required services are available in the target Region before committing.
Cost differences exist between Regions, with some Regions (such as us-east-1) offering lower pricing for compute and storage than newer or more remote Regions.
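One way to reason about the four factors is to treat compliance and service availability as hard filters and latency and cost as weighted preferences. The sketch below is only an illustration of that idea: the `RegionCandidate` type, the weights, the service labels, and every latency and cost figure are invented for the example and are not AWS data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionCandidate:
    name: str
    jurisdiction: str      # illustrative label such as "EU" or "US"
    services: frozenset    # illustrative service labels offered in the Region
    latency_ms: float      # assumed round-trip latency to the main user base
    relative_cost: float   # assumed price index, 1.0 = cheapest candidate


def select_region(candidates, required_jurisdiction, required_services,
                  latency_weight=0.7, cost_weight=0.3):
    """Filter on hard constraints, then pick the lowest weighted score."""
    eligible = [
        c for c in candidates
        if (required_jurisdiction is None or c.jurisdiction == required_jurisdiction)
        and required_services <= c.services
    ]
    if not eligible:
        return None  # no Region satisfies compliance and service requirements

    max_latency = max(c.latency_ms for c in eligible)
    max_cost = max(c.relative_cost for c in eligible)

    def score(c):
        # Normalize both dimensions to 0..1 so the weights are comparable.
        return (latency_weight * c.latency_ms / max_latency
                + cost_weight * c.relative_cost / max_cost)

    return min(eligible, key=score)


# All values below are made up for illustration.
CANDIDATES = [
    RegionCandidate("us-east-1", "US",
                    frozenset({"compute", "relational-db", "ml-inference"}), 95.0, 1.00),
    RegionCandidate("eu-west-1", "EU",
                    frozenset({"compute", "relational-db", "ml-inference"}), 30.0, 1.08),
    RegionCandidate("eu-south-1", "EU",
                    frozenset({"compute", "relational-db"}), 18.0, 1.15),
]

choice = select_region(CANDIDATES, "EU", frozenset({"compute", "relational-db"}))
print(choice.name if choice else "no eligible Region")
```

Note how adding "ml-inference" to the required services changes the answer: the closest EU Region drops out at the filter stage, no matter how good its latency is.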
Multi-AZ as the default resilience pattern
Once a Region is selected, deploying across at least two AZs is the baseline high-availability pattern. An Application Load Balancer distributes traffic across targets in multiple AZs. Amazon RDS Multi-AZ performs synchronous replication and automatic failover to a standby in a different AZ. Auto Scaling groups spread instances across AZs so that losing one AZ triggers replacement capacity in the surviving AZs.
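The capacity implication of spreading instances across AZs can be worked out with simple arithmetic. The following sketch assumes instances are spread evenly and that the goal is to keep full required capacity running immediately after losing any one AZ, before replacement capacity launches; the function names are illustrative.

```python
import math


def instances_per_az(required_instances, az_count):
    """Instances to run in each AZ so that losing any single AZ still
    leaves at least `required_instances` running in the survivors."""
    if az_count < 2:
        raise ValueError("surviving an AZ loss needs at least two AZs")
    return math.ceil(required_instances / (az_count - 1))


def overprovision_ratio(required_instances, az_count):
    """Total provisioned capacity divided by required capacity."""
    total = instances_per_az(required_instances, az_count) * az_count
    return total / required_instances


for azs in (2, 3, 4):
    print(azs, instances_per_az(12, azs), round(overprovision_ratio(12, azs), 2))
```

Under these assumptions, two AZs require running double the needed capacity, while three AZs require only 1.5 times, which is one reason three-AZ deployments are often more cost-efficient for the same resilience target.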
Attention: Multi-AZ protects against failures at the Availability Zone level, but it does not address Region-wide disruptions. This distinction is critical. In scenarios where business continuity requirements demand low Recovery Time Objective (RTO) and Recovery Point Objective (RPO) across Regions, relying solely on Multi-AZ is not sufficient. You must incorporate multi-Region disaster recovery strategies.
Governance plays a key role in reinforcing these architectural decisions at scale.
The following table compares multi-AZ and multi-Region design trade-offs to help architects match patterns to requirements.
Multi-AZ vs. Multi-Region Design Trade-offs
Design Aspect | Multi-AZ (Single Region) | Multi-Region |
Failure Domain Scope | AZ-level isolation | Region-level isolation |
Typical RTO/RPO | Minutes with automated failover | Minutes to hours depending on pattern |
Data Replication | Synchronous (e.g., RDS Multi-AZ) | Asynchronous (e.g., S3 CRR, Aurora Global Database) |
Complexity | Low to moderate | High due to cross-Region networking and data consistency |
Cost | Moderate with redundant AZ resources | High with duplicate infrastructure and cross-Region data transfer |
Compliance Alignment | Single-Region data residency | Requires careful data routing across jurisdictions |
Use Case Fit | Standard high availability workloads | Global applications, regulatory DR, and sovereignty requirements |
When business requirements exceed what a single Region can guarantee, architects must evaluate multi-Region patterns, which are covered in the next section.
Multi-Region architecture patterns
Multi-Region design addresses three scenarios that single-Region deployments cannot satisfy: Region-level disaster recovery, global user distribution with low latency, and data sovereignty across multiple jurisdictions. The two primary patterns, active-passive and active-active, differ fundamentally in cost, complexity, and data consistency guarantees.
Active-passive deployments
In an active-passive configuration, a primary Region handles all production traffic, while a secondary Region maintains replicated data and standby infrastructure. Data replication relies on asynchronous mechanisms such as S3 Cross-Region Replication (CRR) or Aurora Global Database with read replicas in the secondary Region. Failover is orchestrated through Amazon Route 53 health checks configured with a failover routing policy. When the primary Region becomes unhealthy, Route 53 redirects DNS queries to the secondary Region’s endpoints.
The trade-off is clear: active-passive costs less because standby infrastructure can run at reduced capacity, but RTO is higher because the secondary Region may need to scale up compute, promote read replicas to writer instances, and warm caches before serving production traffic.
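The failover routing setup described above can be sketched as a pair of Route 53 records that share a name but carry different failover roles. The function below only builds the change batch as a Python dictionary in the shape accepted by the Route 53 `ChangeResourceRecordSets` API; the domain name, load balancer DNS names, hosted zone IDs, and health check ID are placeholders, and the helper itself is a hypothetical convenience, not part of any AWS SDK.

```python
import json


def failover_change_batch(record_name, primary, secondary):
    """Build a change batch with PRIMARY and SECONDARY alias records."""
    def record(role, target):
        rrset = {
            "Name": record_name,
            "Type": "A",
            "SetIdentifier": f"{role.lower()}-{target['region']}",
            "Failover": role,  # "PRIMARY" or "SECONDARY"
            "AliasTarget": {
                "HostedZoneId": target["alias_hosted_zone_id"],
                "DNSName": target["dns_name"],
                "EvaluateTargetHealth": True,
            },
        }
        # Optionally attach an explicit Route 53 health check.
        if target.get("health_check_id"):
            rrset["HealthCheckId"] = target["health_check_id"]
        return {"Action": "UPSERT", "ResourceRecordSet": rrset}

    return {
        "Comment": "Active-passive failover between two Regions",
        "Changes": [record("PRIMARY", primary), record("SECONDARY", secondary)],
    }


# Placeholder values for illustration only.
PRIMARY = {
    "region": "us-east-1",
    "dns_name": "primary-alb.example.com",
    "alias_hosted_zone_id": "ZEXAMPLEPRIMARY",
    "health_check_id": "00000000-0000-0000-0000-000000000000",
}
SECONDARY = {
    "region": "us-west-2",
    "dns_name": "standby-alb.example.com",
    "alias_hosted_zone_id": "ZEXAMPLESECONDARY",
}

batch = failover_change_batch("app.example.com.", PRIMARY, SECONDARY)
print(json.dumps(batch, indent=2))
```

A dictionary like this could be passed as the `ChangeBatch` argument when calling the API through an SDK. Route 53 answers with the PRIMARY record while it is healthy and falls back to the SECONDARY record otherwise.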
Active-active deployments
Active-active configurations serve live traffic from both Regions simultaneously. Route 53 latency-based routing or weighted routing distributes users to the nearest healthy Region. The primary challenge shifts from failover speed to data consistency.
DynamoDB Global Tables provide multi-Region, multi-active replication with last-writer-wins conflict resolution, making them suitable for workloads that tolerate eventual consistency across Regions.
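To see what last-writer-wins means in practice, the following toy simulation applies two conflicting writes to two replicas in different arrival orders. It models the general technique only and is not how DynamoDB is implemented; the `Write` type and the tie-break on Region name for identical timestamps are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    key: str
    value: str
    timestamp: float  # time at which the write was accepted
    region: str       # Region that accepted the write


def apply_last_writer_wins(replica, write):
    """Keep the incoming write only if it is newer than the stored one.
    Ties on timestamp are broken by Region name (an assumption here)."""
    current = replica.get(write.key)
    if current is None or (
        (write.timestamp, write.region) > (current.timestamp, current.region)
    ):
        replica[write.key] = write


def converge(writes_in_arrival_order):
    """Replay writes in the order a replica received them."""
    replica = {}
    for w in writes_in_arrival_order:
        apply_last_writer_wins(replica, w)
    return replica


w_us = Write("cart#42", "2 items", 100.0, "us-east-1")
w_eu = Write("cart#42", "3 items", 100.5, "eu-west-1")

# Each replica sees the writes in a different order but ends in the same state.
replica_a = converge([w_us, w_eu])
replica_b = converge([w_eu, w_us])
print(replica_a["cart#42"].value, replica_b["cart#42"].value)
```

Both replicas converge, but the earlier write is silently discarded. That loss is the practical meaning of "tolerates eventual consistency": workloads that cannot afford to drop a concurrent update need a different design.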
Practical tip: A common SAP-C02 exam trap presents active-active as if both Regions accept writes for all services. In reality, architects must identify which services support multi-writer semantics (DynamoDB Global Tables) and which require write forwarding or single-writer patterns (Aurora Global Database and most relational stores).
Cross-Region connectivity and traffic steering
Cross-Region VPC connectivity should use
For user-facing traffic, AWS Global Accelerator provides static anycast IP addresses and performs health-check-based failover at the network layer within seconds, without DNS TTL dependencies. This is preferred over CloudFront when the requirement is TCP/UDP traffic steering rather than HTTP content caching.
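The DNS TTL dependency can be made concrete with a rough estimate of how long DNS-based failover takes. The model below is a deliberate simplification: it assumes failure detection takes one check interval per required failed check and that clients keep using a cached answer for the full record TTL. Real behavior also depends on resolver caching and on how health checker results are aggregated.

```python
def dns_failover_worst_case_seconds(check_interval_s, failure_threshold, record_ttl_s):
    """Rough upper bound: time to detect the failure plus time for
    cached DNS answers to expire."""
    detection = check_interval_s * failure_threshold
    return detection + record_ttl_s


# Standard 30-second checks, three consecutive failures, 60-second TTL.
print(dns_failover_worst_case_seconds(30, 3, 60))
# Fast 10-second checks with the same threshold and TTL.
print(dns_failover_worst_case_seconds(10, 3, 60))
```

Because Global Accelerator keeps the client-facing IP addresses fixed and reroutes behind them, the TTL term in this estimate does not apply to it.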
Edge locations and Local Zones
Edge locations and Local Zones address different points on the latency spectrum, and selecting between them depends on whether the workload benefits from caching, network-layer routing, or proximity compute.
CloudFront and edge-tier content delivery
Amazon CloudFront caches static and dynamic content at 750+ points of presence (PoPs) and 15 Regional Edge Caches, reducing origin load and improving user-perceived latency for HTTP/HTTPS workloads. A CloudFront distribution can be configured with origin failover, where the distribution automatically switches to a secondary origin (potentially in another Region) if the primary origin returns 5xx errors or times out. This provides a lightweight multi-Region resilience pattern for content delivery without the full complexity of an active-passive deployment.
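The origin failover behavior can be modeled in a few lines. This is a conceptual simulation, not the CloudFront API: origins are represented as plain Python callables, and the set of status codes that trigger failover is an assumption chosen to match the 5xx case described above (in a real distribution this list is configurable).

```python
# Assumed failover triggers for this sketch.
FAILOVER_STATUS_CODES = frozenset({500, 502, 503, 504})


def fetch_with_origin_failover(path, primary, secondary):
    """Try the primary origin; fall back to the secondary on a failover
    status code or a timeout. Each origin maps path -> (status, body)."""
    try:
        status, body = primary(path)
        if status not in FAILOVER_STATUS_CODES:
            return "primary", status, body
    except TimeoutError:
        pass  # treat a timeout like a failed primary response
    status, body = secondary(path)
    return "secondary", status, body


# Stub origins standing in for endpoints in two Regions.
def healthy_origin(path):
    return 200, f"content for {path}"


def failing_origin(path):
    return 503, "service unavailable"


def hanging_origin(path):
    raise TimeoutError("origin did not respond")


print(fetch_with_origin_failover("/index.html", failing_origin, healthy_origin))
```

Failover here is decided per request, which is why it is lighter than DNS-based regional failover: no records change and no standby capacity has to be promoted.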
Global Accelerator vs. CloudFront
The architectural decision between CloudFront and Global Accelerator maps to a simple question: Does the workload benefit from caching, or does it need deterministic network-layer routing? CloudFront optimizes HTTP/HTTPS delivery with edge caching. Global Accelerator routes TCP and UDP traffic over the AWS backbone to the optimal regional endpoint, providing instant health-check-based failover and consistent network performance without caching.
Note: For the SAP-C02 exam, if a scenario mentions caching, static content, or HTTP acceleration, CloudFront is the answer. If the scenario mentions TCP/UDP traffic, gaming, VoIP, or sub-second regional failover, Global Accelerator is the answer.
Local Zones for ultra-low-latency compute
When even edge location latency is insufficient (real-time gaming, media rendering, AR/VR applications), Local Zones place EC2 instances and select services within a few milliseconds of end users in major metropolitan areas. Architects opt in to a Local Zone and extend a VPC subnet into it, allowing instances to run locally while management remains in the parent Region. The trade-off is a limited set of instance types and services compared to full AZs, so Local Zones complement rather than replace standard AZ deployments.
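The selection rules in this section can be condensed into a small function. It is a study aid under simplifying assumptions, not an official decision procedure: the parameter names are invented, and real workloads often combine several layers rather than picking exactly one.

```python
def choose_edge_layer(protocol, needs_local_compute=False):
    """Map a workload's dominant requirement to an infrastructure layer.

    protocol: "http", "https", "tcp", or "udp"
    needs_local_compute: True when compute itself must sit within a few
        milliseconds of users in a metro area
    """
    protocol = protocol.lower()
    if needs_local_compute:
        return "Local Zone"
    if protocol in ("http", "https"):
        return "CloudFront"
    if protocol in ("tcp", "udp"):
        return "Global Accelerator"
    raise ValueError(f"unsupported protocol: {protocol}")


print(choose_edge_layer("https"))                          # static or cacheable web content
print(choose_edge_layer("udp"))                            # VoIP or game traffic
print(choose_edge_layer("tcp", needs_local_compute=True))  # real-time rendering
```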
The following decision tree maps workload requirements to the appropriate infrastructure layer.
With workload placement and edge strategies defined, the final architectural consideration is how connectivity and governance tie these layers together at enterprise scale.
Connectivity and governance at scale
Global infrastructure decisions do not exist in isolation. They intersect with hybrid connectivity patterns and organizational governance, which determine whether an architecture is truly resilient or merely redundant.
AWS Transit Gateway acts as a regional hub for connecting VPCs and on-premises networks. Inter-Region Transit Gateway peering extends this hub-and-spoke model across Regions, replacing the anti-pattern of full-mesh VPC peering that becomes unmanageable beyond a few VPCs. For hybrid connectivity, a single AWS Direct Connect connection is not highly available. Architects should deploy redundant connections at geographically diverse Direct Connect locations or use Site-to-Site VPN as a backup path.
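The reason full-mesh peering becomes unmanageable is combinatorial: every pair of VPCs needs its own peering connection, whereas a hub needs one attachment per VPC. A short calculation, with illustrative function names, shows how quickly the two diverge.

```python
def full_mesh_peerings(vpc_count):
    """Peering connections needed so every VPC reaches every other VPC."""
    return vpc_count * (vpc_count - 1) // 2


def hub_attachments(vpc_count):
    """Attachments needed when every VPC connects to a single hub."""
    return vpc_count


for n in (5, 20, 50):
    print(n, full_mesh_peerings(n), hub_attachments(n))
```

Mesh connections grow quadratically with the number of VPCs while hub attachments grow linearly, and each mesh connection also brings its own route table entries to maintain.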
Governance at scale requires AWS Organizations with Service Control Policies (SCPs) to restrict Region usage, enforce tagging standards, and prevent resource creation outside approved boundaries. The blast radius principle guides isolation strategy: segment workloads by account, Region, and AZ so that a failure or security incident in one boundary does not propagate. Centralized network inspection using AWS Network Firewall or Gateway Load Balancer in a shared services VPC must be designed to avoid asymmetric routing and single points of failure.
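A Region restriction of this kind is commonly written as an SCP that denies requests whose `aws:RequestedRegion` falls outside an approved list, while exempting global services that are not tied to a Region. The sketch below builds such a policy document as a Python dictionary; the approved Regions and the exempted action list are example values that would need review for a real organization, and the helper function is hypothetical.

```python
import json


def region_restriction_scp(approved_regions, exempt_actions):
    """Build an SCP document that denies actions outside approved Regions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideApprovedRegions",
                "Effect": "Deny",
                # Actions listed here are NOT subject to the Region check.
                "NotAction": sorted(exempt_actions),
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(approved_regions)
                    }
                },
            }
        ],
    }


# Example values only; adjust to the organization's actual requirements.
policy = region_restriction_scp(
    approved_regions={"eu-west-1", "eu-central-1"},
    exempt_actions={"iam:*", "organizations:*", "route53:*",
                    "cloudfront:*", "support:*"},
)
print(json.dumps(policy, indent=2))
```

Using an explicit Deny with `NotAction` keeps the policy short: everything is blocked outside the approved Regions except the listed global services, which would otherwise break because their API calls are typically served from a single Region.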
Practical tip: On the SAP-C02 exam, the correct answer always aligns the infrastructure layer with the specific requirement: latency, resilience, compliance, or cost. Choosing a complex multi-Region active-active pattern when a multi-AZ deployment meets the stated RTO/RPO is just as incorrect as selecting a single-AZ design when the scenario requires regional failover.
Summary
AWS global infrastructure is built on layered failure domains that architects choose based on workload needs. Availability Zones (AZs) provide high availability within a Region through isolation and automated failover. Multi-Region designs extend resilience through active-passive disaster recovery or active-active global architectures, at the price of added complexity and data consistency trade-offs. Edge locations improve content delivery, AWS Global Accelerator optimizes traffic with rapid failover, and Local Zones support ultra-low-latency workloads near users. Governance via AWS Organizations, SCPs, and blast radius isolation enforces controlled, compliant architecture at scale. For SAP-C02, always match the infrastructure layer to the requirement instead of defaulting to the simplest or most complex design.