Performance Optimization for Compute

Explore how to optimize AWS compute performance by understanding Auto Scaling policies, placement groups, and fault isolation strategies. Learn to balance throughput, latency, and availability to design resilient, scalable cloud systems that meet diverse workload demands.

We'll cover the following...

Compute performance as an architecture discipline
EC2 Auto Scaling strategies
- Scaling policy types
- Scaling signals and health models
Placement groups and hardware optimization
Fault isolation through distributed compute
- Blast-radius engineering through distribution
Balancing performance and availability

At the AWS Solutions Architect Professional level, compute performance optimization is not a matter of selecting the largest instance type available. It is a multidimensional architectural discipline in which scaling behavior, physical hardware placement, and fault-isolation strategies interact to determine whether a system meets its throughput, latency, and availability targets simultaneously. The exam tests your ability to reason across these dimensions under realistic constraints, selecting answers that balance competing requirements rather than optimizing a single metric in isolation.

Compute performance as an architecture discipline

The AWS Well-Architected Framework’s Performance Efficiency pillar treats compute optimization as a continuous process of matching resources to workload characteristics. Three architectural pillars define this discipline. First, EC2 Auto Scaling strategies govern how compute capacity adapts elastically to demand. Second, placement groups give architects control over how instances are physically positioned on underlying hardware, which directly affects network latency and failure correlation. Third, distribution patterns across Availability Zones, racks, and instance families reduce blast radius and help prevent correlated failures from cascading into outages.

These pillars are not independent. A cluster placement group delivers the lowest network latency for HPC workloads, but it packs instances close together inside a single Availability Zone, creating a large correlated failure domain that Auto Scaling alone cannot mitigate. Conversely, spreading instances across multiple Availability Zones improves fault tolerance but introduces cross-AZ latency that may be unacceptable for tightly coupled compute. Services such as EC2, Auto Scaling groups, and enhanced networking through ENA and EFA form the foundation. ENA (Elastic Network Adapter) is the default high-performance network interface for EC2 instances, supporting up to 100 Gbps of bandwidth within a VPC. EFA (Elastic Fabric Adapter) is a network device attachable to EC2 instances that lets HPC applications use OS-bypass hardware interfaces for low-latency, high-throughput inter-node communication. Every architectural decision in this domain directly affects availability, cost, and operational complexity, and the exam expects you to evaluate all three simultaneously.
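The failure-correlation side of this trade-off can be made concrete with a little arithmetic. The following is a minimal Python sketch, not an AWS API: the function name `worst_case_capacity_loss`, the instance IDs, and the failure-domain labels are all hypothetical, and it assumes that losing one failure domain takes down every instance placed in it.

```python
from collections import Counter


def worst_case_capacity_loss(placement):
    """Return the fraction of the fleet lost if the busiest failure domain fails.

    `placement` maps a (hypothetical) instance ID to a failure-domain label.
    """
    if not placement:
        raise ValueError("placement must contain at least one instance")
    # Count how many instances share each failure domain.
    per_domain = Counter(placement.values())
    # The worst case is the domain that holds the most instances.
    return max(per_domain.values()) / len(placement)


# Illustrative six-instance fleets; labels are assumptions, not real AWS identifiers.
# Tightly packed: every instance shares one failure domain.
clustered = {f"i-{n}": "az-a/shared-hardware" for n in range(6)}
# Distributed: two instances in each of three Availability Zones.
spread_across_azs = {f"i-{n}": f"az-{'abc'[n % 3]}" for n in range(6)}

print("clustered:", worst_case_capacity_loss(clustered))
print("spread across AZs:", worst_case_capacity_loss(spread_across_azs))
```

Under these assumptions, the tightly packed fleet can lose all of its capacity to a single failure, while the fleet distributed across three zones loses at most one third. The sketch deliberately ignores the latency benefit that tight packing buys, which is the other half of the trade-off.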

With this framing established, the next section examines how Auto Scaling adapts compute capacity to demand and why the choice of scaling signal matters as much as the scaling mechanism itself.

EC2 Auto Scaling strategies

...