AWS Global Infrastructure for Architects

Explore AWS global infrastructure layers to understand how to architect resilient, low-latency, and compliant cloud systems. Learn to make foundational decisions about Region and Availability Zone selection, multi-Region and multi-AZ patterns, hybrid connectivity, and governance to optimize workload placement and disaster recovery.

We'll cover the following...

Understanding AWS global infrastructure
- The postal system analogy
Region and AZ design decisions
- Multi-AZ as the default resilience pattern
Multi-Region architecture patterns
- Active-passive deployments
- Active-active deployments
  - Cross-Region connectivity and traffic steering
Edge locations and Local Zones
Connectivity and governance at scale
Summary

Every enterprise-grade AWS deployment begins with a single, foundational decision: where to place workloads across the planet. For the AWS Solutions Architect exam, this decision is not about picking a Region from a dropdown. It is about evaluating failure domains, latency boundaries, compliance constraints, and cost trade-offs across a layered global infrastructure. Architects who master these layers can design systems that survive Availability Zone outages, recover from regional disasters, and serve users worldwide with minimal latency.

This lesson breaks down each layer and the architectural patterns that connect them.

Understanding AWS global infrastructure

AWS global infrastructure is designed as a layered hierarchy of isolation boundaries, where each layer plays a specific role in improving resilience, availability, and performance.

At the highest level are Regions. These are geographically separated clusters of data centers, each fully independent and with its own power, cooling, and networking. This isolation allows you to design systems that remain operational even if an entire Region is affected.

Within each Region are Availability Zones (AZs). Each AZ consists of one or more physically separate data centers, and the AZs in a Region are connected by low-latency, high-throughput networking. From a networking perspective, every Amazon VPC subnet resides in exactly one AZ, so architects place resources in a specific AZ by choosing a subnet, which lets them group resources logically and isolate failures. Because AZs are isolated from one another, applications can be distributed across multiple AZs to achieve high availability and fault tolerance.

Extending beyond Regions and AZs are edge locations, which form the outermost layer of AWS infrastructure. With more than 750 points of presence worldwide, these locations power services like Amazon CloudFront for content caching and AWS Global Accelerator for optimized network routing. They bring data closer to users, reducing latency.

AWS also offers Local Zones, which place compute, storage, and database services closer to large population centers. These are designed for latency-sensitive applications that require single-digit-millisecond response times, such as real-time gaming or media production. Similarly, AWS Wavelength Zones embed AWS compute and storage services within the data centers of 5G telecommunication providers, delivering ultra-low-latency applications directly to mobile edge devices and end users.

The postal system analogy

A helpful way to understand this structure is through a postal system analogy. Regions are like independent countries, each with its own postal network. Availability Zones are sorting facilities within a country. If one facility fails, others continue operating. Edge locations act like neighborhood mailboxes, bringing services closer to users. Local Zones are satellite offices in densely populated areas, enabling faster, near-instant delivery.

At the time of writing, AWS infrastructure spans 39 Regions, 123 Availability Zones, 750+ edge locations, 43 Local Zones, and 33 Wavelength Zones. These counts change as AWS expands, so verify current figures before relying on them.

Choosing the right combination of these layers is the first major architectural decision. Every other choice an architect makes, including networking design, data replication strategy, and governance model, builds on this foundation.

The following diagram illustrates how these layers relate to each other and how user traffic traverses them.

With this layered model established, the next step is understanding how architects evaluate and select specific Regions and AZs for workload placement.

Region and AZ design decisions

Selecting a Region is a multivariable optimization problem. Four primary factors drive the decision, and architects must weigh them against each other rather than optimizing for any single dimension.

Data residency and compliance dictate Region selection when regulations such as the General Data Protection Regulation (GDPR), the European Union law that governs how personal data of individuals in the EU is collected, stored, processed, and protected, require data to remain within specific geographic boundaries. In that case, EU-based Regions become mandatory regardless of latency or cost.
Proximity to end users reduces round-trip latency, which directly impacts user experience for interactive applications and API-driven workloads.
Service availability varies across Regions because AWS does not launch every service simultaneously in all Regions. Architects must verify that required services are available in the target Region before committing.
Cost differences exist between Regions, with some Regions (such as us-east-1) offering lower pricing for compute and storage than newer or more remote Regions.
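One way to reason about these four factors is to treat compliance and service availability as hard filters and then rank the remaining Regions on latency and cost. The following Python sketch illustrates that idea. The `RegionCandidate` type, the weights, and all latency and cost figures are hypothetical placeholders, not AWS data or an AWS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionCandidate:
    name: str
    in_required_jurisdiction: bool  # satisfies the data residency rule
    services: frozenset             # services assumed available in the Region
    latency_ms: float               # hypothetical round trip from the main user base
    relative_cost: float            # hypothetical cost index (lower is cheaper)


def choose_region(candidates, required_services, latency_weight=0.7, cost_weight=0.3):
    """Apply hard constraints first, then rank by a weighted latency/cost score."""
    eligible = [
        c for c in candidates
        if c.in_required_jurisdiction and required_services <= c.services
    ]
    if not eligible:
        return None  # no Region satisfies compliance and service requirements

    max_latency = max(c.latency_ms for c in eligible)
    max_cost = max(c.relative_cost for c in eligible)

    def score(c):
        # Normalize both dimensions to (0, 1] so the weights are comparable.
        return (latency_weight * c.latency_ms / max_latency
                + cost_weight * c.relative_cost / max_cost)

    return min(eligible, key=score)


# Illustrative numbers only.
candidates = [
    RegionCandidate("eu-central-1", True, frozenset({"ec2", "rds", "dynamodb"}), 25.0, 1.10),
    RegionCandidate("eu-west-1", True, frozenset({"ec2", "rds", "dynamodb"}), 40.0, 1.00),
    RegionCandidate("us-east-1", False, frozenset({"ec2", "rds", "dynamodb"}), 95.0, 0.95),
]
best = choose_region(candidates, frozenset({"ec2", "rds"}))
print(best.name)  # eu-central-1
```

Note that the cheapest candidate is excluded before scoring because it fails the residency filter, which mirrors the point that compliance overrides latency and cost.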

Multi-AZ as the default resilience pattern

Once a Region is selected, deploying across at least two AZs is the baseline high-availability pattern. An Application Load Balancer distributes traffic across targets in multiple AZs. Amazon RDS Multi-AZ performs synchronous replication and automatic failover to a standby in a different AZ. Auto Scaling groups spread instances across AZs so that losing one AZ triggers replacement capacity in the surviving AZs.
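A related sizing question is how many instances to run per AZ so that the workload still has enough capacity while one AZ is unavailable. The arithmetic below is a minimal sketch of that reasoning. The function name and the example of six required instances are illustrative assumptions, not an Auto Scaling setting.

```python
import math


def instances_per_az(required_instances: int, az_count: int) -> int:
    """Instances to run in each AZ so that losing any single AZ still leaves
    at least `required_instances` running in the surviving AZs."""
    if az_count < 2:
        raise ValueError("need at least two AZs to survive the loss of one")
    return math.ceil(required_instances / (az_count - 1))


for azs in (2, 3):
    per_az = instances_per_az(6, azs)
    print(f"{azs} AZs: {per_az} per AZ, {per_az * azs} total")
# 2 AZs: 6 per AZ, 12 total
# 3 AZs: 3 per AZ, 9 total
```

Under these assumptions, spreading across three AZs needs fewer total instances than two AZs for the same surviving capacity, which is one reason three-AZ layouts are common.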

Attention: Multi-AZ protects against failures at the Availability Zone level, but it does not address Region-wide disruptions. This distinction is critical. In scenarios where business continuity requirements demand low Recovery Time Objective (RTO) and Recovery Point Objective (RPO) across Regions, relying solely on Multi-AZ is not sufficient. You must incorporate multi-Region disaster recovery strategies.

Governance plays a key role in reinforcing these architectural decisions at scale. For example, Service Control Policies (SCPs) are organization-wide permission guardrails applied through AWS Organizations. They can restrict which Regions, services, or actions are permitted across all accounts, preventing accidental or unauthorized workload deployment outside approved boundaries. Restricting which Regions teams are allowed to use helps enforce compliance requirements, maintain architectural consistency, and control costs by preventing the deployment of resources in unapproved Regions.
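As a rough illustration of a Region-restricting SCP, the sketch below builds the policy document as a Python dictionary using the `aws:RequestedRegion` condition key. The approved Regions and the list of exempted global-service actions are assumptions chosen for the example; a real policy needs an exemption list reviewed against the services the organization actually uses.

```python
import json


def region_restriction_scp(approved_regions, exempt_actions):
    """Build an SCP document that denies requests outside the approved Regions.

    `exempt_actions` lists actions for global services that should stay usable
    even though their requests are not made against an approved Region.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideApprovedRegions",
                "Effect": "Deny",
                "NotAction": sorted(exempt_actions),
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {
                        "aws:RequestedRegion": sorted(approved_regions)
                    }
                },
            }
        ],
    }


# Hypothetical choices for illustration.
policy = region_restriction_scp(
    approved_regions={"eu-central-1", "eu-west-1"},
    exempt_actions={"iam:*", "organizations:*", "route53:*", "cloudfront:*", "support:*"},
)
print(json.dumps(policy, indent=2))
```

The resulting JSON would then be attached to an organizational unit or account through AWS Organizations. Because SCPs only set guardrails, identities in the affected accounts still need their own IAM permissions for anything that is not denied.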

The following table compares multi-AZ and multi-Region design trade-offs to help architects match patterns to requirements.

Multi-AZ vs. Multi-Region Design Trade-offs

Design Aspect	Multi-AZ (Single Region)	Multi-Region
Failure Domain Scope	AZ-level isolation	Region-level isolation
Typical RTO/RPO	Minutes with automated failover	Minutes to hours depending on pattern
Data Replication	Synchronous (e.g., RDS Multi-AZ)	Asynchronous (e.g., S3 CRR, Aurora Global Database)
Complexity	Low to moderate	High due to cross-Region networking and data consistency
Cost	Moderate with redundant AZ resources	High with duplicate infrastructure and cross-Region data transfer
Compliance Alignment	Single-Region data residency	Requires careful data routing across jurisdictions
Use Case Fit	Standard high availability workloads	Global applications, regulatory DR, and sovereignty requirements

When business requirements exceed what a single Region can guarantee, architects must evaluate multi-Region patterns, which are covered in the next section.

Multi-Region architecture patterns

Multi-Region design addresses three scenarios that single-Region deployments cannot satisfy: Region-level disaster recovery, global user distribution with low latency, and data sovereignty across multiple jurisdictions. The two primary patterns, active-passive and active-active, differ fundamentally in cost, complexity, and data consistency guarantees.

Active-passive deployments

In an active-passive configuration, a primary Region handles all production traffic, while a secondary Region maintains replicated data and standby infrastructure. Data replication relies on asynchronous mechanisms such as S3 Cross-Region Replication (CRR) or Aurora Global Database with read replicas in the secondary Region. Failover is orchestrated through Amazon Route 53 health checks configured with a failover routing policy. When the primary Region becomes unhealthy, Route 53 redirects DNS queries to the secondary Region’s endpoints.

The trade-off is clear: active-passive costs less because standby infrastructure can run at reduced capacity, but RTO is higher because the secondary Region may need to scale up compute, promote read replicas to writer instances, and warm caches before serving production traffic.
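To make the failover routing policy concrete, the sketch below assembles a pair of Route 53 failover records in the shape of a `ChangeBatch` for the `ChangeResourceRecordSets` API. The domain name, IP addresses (taken from documentation ranges), and health check ID are hypothetical, and plain A records are used for brevity where a real deployment would often use alias records pointing at load balancers.

```python
def failover_record(name, ip_address, role, health_check_id=None, ttl=60):
    """Return one UPSERT change for a failover routing record.

    `role` must be "PRIMARY" or "SECONDARY", the two values the failover
    routing policy accepts.
    """
    if role not in ("PRIMARY", "SECONDARY"):
        raise ValueError("role must be PRIMARY or SECONDARY")
    record = {
        "Name": name,
        "Type": "A",
        "SetIdentifier": f"{role.lower()}-region",
        "Failover": role,
        "TTL": ttl,  # a short TTL limits how long resolvers cache the old answer
        "ResourceRecords": [{"Value": ip_address}],
    }
    if health_check_id is not None:
        record["HealthCheckId"] = health_check_id
    return {"Action": "UPSERT", "ResourceRecordSet": record}


# Hypothetical values for illustration.
change_batch = {
    "Comment": "Active-passive failover between two Regions",
    "Changes": [
        failover_record("app.example.com.", "192.0.2.10", "PRIMARY",
                        health_check_id="hypothetical-health-check-id"),
        failover_record("app.example.com.", "198.51.100.20", "SECONDARY"),
    ],
}
print([c["ResourceRecordSet"]["Failover"] for c in change_batch["Changes"]])
# ['PRIMARY', 'SECONDARY']
```

With boto3, a dictionary like this could be passed as the `ChangeBatch` argument of the Route 53 client's `change_resource_record_sets` call, together with the hosted zone ID. The health check on the primary record is what lets Route 53 decide when to answer with the secondary.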

Active-active deployments

Active-active configurations serve live traffic from both Regions simultaneously. Route 53 latency-based routing sends each user to the healthy Region with the lowest latency, while weighted routing splits traffic between Regions in configured proportions. The primary challenge shifts from failover speed to data consistency.

DynamoDB Global Tables provide multi-Region, multi-active replication with last-writer-wins conflict resolution, making them suitable for workloads that tolerate eventual consistency across Regions.
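The effect of last-writer-wins is easiest to see with two Regions updating the same item at nearly the same time. The following is a simplified model of the idea, not DynamoDB's actual implementation. The `Write` type, the timestamps, and the tie-break on Region name are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    key: str
    value: str
    timestamp_ms: int
    region: str


def merge_last_writer_wins(*replica_logs):
    """Return the converged key -> Write mapping once replicas exchange writes."""
    converged = {}
    for log in replica_logs:
        for write in log:
            current = converged.get(write.key)
            # Later timestamp wins. Ties are broken on Region name so that
            # every replica picks the same winner (an assumption of this model).
            if current is None or (
                (write.timestamp_ms, write.region)
                > (current.timestamp_ms, current.region)
            ):
                converged[write.key] = write
    return converged


us_log = [Write("cart#42", "2 items", 1000, "us-east-1")]
eu_log = [Write("cart#42", "3 items", 1005, "eu-west-1")]
state = merge_last_writer_wins(us_log, eu_log)
print(state["cart#42"].value)  # 3 items
```

The earlier write from us-east-1 is silently discarded, which is why this model suits workloads that tolerate eventual consistency and not those where every concurrent update must be preserved.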

Practical tip: A common SAP-C02 exam trap presents active-active as if both Regions accept writes for all services. In reality, architects must identify which services support multi-writer semantics (DynamoDB Global Tables) and which require write forwarding or single-writer patterns (Aurora Global Database and most relational stores).

Cross-Region connectivity and traffic steering

Cross-Region VPC connectivity should use Transit Gateway inter-Region peering, a managed, non-transitive peering connection between Transit Gateways in different Regions that routes traffic over the AWS private backbone. It replaces the anti-pattern of creating a full mesh of individual VPC peering connections. This hub-and-spoke model scales cleanly as the number of VPCs grows.
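A quick count shows why the hub-and-spoke model scales better than a full mesh. The sketch below compares the number of VPC peering connections in a full mesh with the number of Transit Gateway attachments and inter-Region peerings, assuming one Transit Gateway per Region and a peering between every pair of Transit Gateways (since the peering is non-transitive). The function names are illustrative.

```python
def full_mesh_peerings(vpc_count: int) -> int:
    """VPC peering connections needed so that every VPC reaches every other VPC."""
    return vpc_count * (vpc_count - 1) // 2


def hub_and_spoke_links(vpcs_per_region: int, region_count: int) -> int:
    """Transit Gateway attachments plus inter-Region Transit Gateway peerings."""
    attachments = vpcs_per_region * region_count
    tgw_peerings = region_count * (region_count - 1) // 2
    return attachments + tgw_peerings


for regions in (2, 3):
    total_vpcs = 10 * regions
    print(regions, full_mesh_peerings(total_vpcs), hub_and_spoke_links(10, regions))
# 2 190 21
# 3 435 33
```

The full mesh grows quadratically with the number of VPCs, while the hub-and-spoke count grows roughly linearly.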

For user-facing traffic, AWS Global Accelerator provides static anycast IP addresses and performs health-check-based failover at the network layer within seconds, without DNS TTL dependencies. This is preferred over CloudFront when the requirement is TCP/UDP traffic steering rather than HTTP content caching.

Edge locations and Local Zones

Edge locations and Local Zones address different points on the latency spectrum, and selecting between them depends on whether the workload benefits from caching, network-layer routing, or proximity compute.

CloudFront and edge-tier content delivery

Amazon CloudFront caches static and dynamic content at 750+ points of presence (PoPs) and 15 Regional Edge Caches, reducing origin load and improving user-perceived latency for HTTP/HTTPS workloads. A CloudFront distribution can be configured with origin failover, where the distribution automatically switches to a secondary origin (potentially in another Region) if the primary origin returns 5xx errors or times out. This provides a lightweight multi-Region resilience pattern for content delivery without the full complexity of an active-passive deployment.
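The origin failover behavior can be modeled in a few lines. The sketch below is a simplified simulation of the decision logic described above, not CloudFront configuration or an AWS API. The origin functions and the set of status codes that trigger failover are assumptions for the example.

```python
# Assumed set of status codes that trigger failover in this model.
FAILOVER_STATUS_CODES = frozenset({500, 502, 503, 504})


def fetch_with_origin_failover(path, primary, secondary,
                               failover_codes=FAILOVER_STATUS_CODES):
    """Try the primary origin; fall back to the secondary on a timeout
    or on a status code listed in `failover_codes`."""
    try:
        status, body = primary(path)
    except TimeoutError:
        status, body = None, None
    if status is None or status in failover_codes:
        status, body = secondary(path)
        return status, body, "secondary"
    return status, body, "primary"


# Hypothetical origins standing in for endpoints in two Regions.
def healthy_origin(path):
    return 200, f"content for {path}"


def failing_origin(path):
    return 503, "service unavailable"


def slow_origin(path):
    raise TimeoutError("origin did not respond in time")


print(fetch_with_origin_failover("/index.html", failing_origin, healthy_origin))
# (200, 'content for /index.html', 'secondary')
```

In this model the fallback happens per request, so no DNS change or standby promotion is involved, which is what keeps the pattern lightweight.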

Global Accelerator vs. CloudFront

The architectural decision between CloudFront and Global Accelerator maps to a simple question: Does the workload benefit from caching, or does it need deterministic network-layer routing? CloudFront optimizes HTTP/HTTPS delivery with edge caching. Global Accelerator routes TCP and UDP traffic over the AWS backbone to the optimal regional endpoint, providing health-check-based failover within seconds and consistent network performance without caching.

Note: For the SAP-C02 exam, if a scenario mentions caching, static content, or HTTP acceleration, CloudFront is the answer. If the scenario mentions TCP/UDP traffic, gaming, VoIP, or sub-second regional failover, Global Accelerator is the answer.

Local Zones for ultra-low-latency compute

When even edge location latency is insufficient (real-time gaming, media rendering, AR/VR applications), Local Zones place EC2 instances and select services within a few milliseconds of end users in major metropolitan areas. Architects opt in to a Local Zone and extend a VPC subnet into it, allowing instances to run locally while management remains in the parent Region. The trade-off is a limited set of instance types and services compared to full AZs, so Local Zones complement rather than replace standard AZ deployments.
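The choices in this section can be condensed into a small decision function. This is a deliberately coarse sketch of the rules of thumb above. The parameter names are hypothetical, and real workloads often combine layers instead of picking exactly one.

```python
def choose_edge_layer(needs_local_compute=False,
                      http_caching_or_acceleration=False,
                      tcp_udp_or_fast_failover=False):
    """Map coarse workload requirements to an infrastructure layer."""
    if needs_local_compute:
        # Compute must run within a few milliseconds of users in a metro area.
        return "Local Zone"
    if http_caching_or_acceleration:
        # Cacheable or HTTP/HTTPS content benefits from edge caching.
        return "CloudFront"
    if tcp_udp_or_fast_failover:
        # Non-HTTP traffic or fast regional failover without caching.
        return "Global Accelerator"
    return "Regional endpoint only"


print(choose_edge_layer(http_caching_or_acceleration=True))  # CloudFront
print(choose_edge_layer(tcp_udp_or_fast_failover=True))      # Global Accelerator
print(choose_edge_layer(needs_local_compute=True))           # Local Zone
```

The ordering of the checks is itself an assumption: proximity compute is tested first because neither caching nor network-layer routing can substitute for running the instance near the user.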

The following decision tree maps workload requirements to the appropriate infrastructure layer.

With workload placement and edge strategies defined, the final architectural consideration is how connectivity and governance tie these layers together at enterprise scale.

Connectivity and governance at scale

Global infrastructure decisions do not exist in isolation. They intersect with hybrid connectivity patterns and organizational governance, which determine whether an architecture is truly resilient or merely redundant.

AWS Transit Gateway acts as a regional hub for connecting VPCs and on-premises networks. Inter-Region Transit Gateway peering extends this hub-and-spoke model across Regions, replacing the anti-pattern of full-mesh VPC peering that becomes unmanageable beyond a few VPCs. For hybrid connectivity, a single AWS Direct Connect connection is not highly available. Architects should deploy redundant connections at geographically diverse Direct Connect locations or use Site-to-Site VPN as a backup path. A Direct Connect Gateway is a globally available resource that enables a single Direct Connect connection to reach VPCs across multiple Regions, simplifying hybrid network architecture without requiring separate connections per Region.

Governance at scale requires AWS Organizations with Service Control Policies (SCPs) to restrict Region usage, enforce tagging standards, and prevent resource creation outside approved boundaries. The blast radius principle guides isolation strategy: segment workloads by account, Region, and AZ so that a failure or security incident in one boundary does not propagate. Centralized network inspection using AWS Network Firewall or Gateway Load Balancer in a shared services VPC must be designed to avoid asymmetric routing and single points of failure.

Practical tip: On the SAP-C02 exam, the correct answer always aligns the infrastructure layer with the specific requirement: latency, resilience, compliance, or cost. Choosing a complex multi-Region active-active pattern when a multi-AZ deployment meets the stated RTO/RPO is just as incorrect as selecting a single-AZ design when the scenario requires regional failover.
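The "match the pattern to the requirement" rule can be expressed as picking the cheapest pattern that still satisfies the stated RTO, RPO, and failure scope. The sketch below shows that selection. Every number in it is a made-up placeholder for illustration, not a measured or AWS-published RTO, RPO, or cost figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    name: str
    rto_minutes: float
    rpo_minutes: float
    survives_region_outage: bool
    relative_cost: float


def simplest_sufficient_pattern(patterns, max_rto, max_rpo, need_region_failover):
    """Return the lowest-cost pattern that meets the stated requirements."""
    sufficient = [
        p for p in patterns
        if p.rto_minutes <= max_rto
        and p.rpo_minutes <= max_rpo
        and (p.survives_region_outage or not need_region_failover)
    ]
    return min(sufficient, key=lambda p: p.relative_cost) if sufficient else None


# Placeholder values; substitute figures from your own design and testing.
patterns = [
    Pattern("multi-AZ", 5, 0, False, 1.0),
    Pattern("multi-Region active-passive", 60, 5, True, 1.6),
    Pattern("multi-Region active-active", 1, 1, True, 2.2),
]

print(simplest_sufficient_pattern(patterns, 30, 5, False).name)
# multi-AZ
print(simplest_sufficient_pattern(patterns, 120, 15, True).name)
# multi-Region active-passive
```

Under these placeholder figures, active-active is selected only when the requirement combines regional failover with an RTO that the standby pattern cannot meet.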

Summary

AWS global infrastructure is built on layered failure domains that architects choose based on workload needs. Availability Zones (AZs) provide high availability within a Region through isolation and automated failover. Multi-Region designs extend resilience through active-passive disaster recovery or active-active global architectures, at the price of added complexity and data consistency trade-offs. Edge locations improve content delivery, AWS Global Accelerator optimizes traffic with rapid failover, and Local Zones support ultra-low-latency workloads near users. Governance via AWS Organizations, SCPs, and blast radius isolation enforces controlled, compliant architecture at scale. For SAP-C02, always match the infrastructure layer to the requirement instead of defaulting to the simplest or most complex design.