Compute Strategy Design

Explore how to design an effective compute strategy in AWS by evaluating execution models such as EC2, containers, and serverless. Understand workload mapping, performance needs, and cost considerations to select the best compute option. Learn to optimize instance families, adopt Graviton processors, and apply high-performance computing patterns including cluster placement groups and Elastic Fabric Adapter for low-latency applications. This lesson helps you develop architectural decision-making skills to deploy scalable and efficient AWS compute environments.

We'll cover the following...

Compute as an architectural decision
Matching execution models to workloads
EC2 instance family selection and optimization
High-performance compute design
- Cluster placement groups and network interfaces
  - EFA vs. ENA
- ParallelCluster and shared storage
Cost and performance alignment
- Purchase models
- Scaling strategies and resilience
Conclusion

In AWS solution architecture at the professional level, compute selection is never a checkbox decision. It is an architectural choice that influences every layer of a system, including networking design, storage integration, scaling behavior, cost structure, and operational responsibility. Scenario-based questions evaluate whether an architect can reason about these trade-offs in context rather than defaulting to the most managed option. This lesson establishes a structured framework for compute decisions, moving from execution models to workload mapping, performance considerations, and cost alignment.

Compute as an architectural decision

Choosing between EC2, containers, and serverless is a trade-off between control and operational overhead. Amazon EC2 provides full OS-level control, custom kernel tuning, and support for specialized hardware such as GPUs and custom networking, but requires you to manage patching, scaling, and capacity planning. Containers on services like Amazon ECS and Amazon EKS offer portability and orchestration benefits, but introduce cluster and scheduling complexity. Serverless options like AWS Lambda and AWS Fargate remove server management, but impose constraints such as execution time limits, runtime boundaries, and, in Lambda's case, an event-driven design.

The key architectural principle is constraint validation before optimization: confirm which options a workload's hard requirements rule out before optimizing for simplicity or cost. Workloads that require kernel-level tuning, dedicated hardware access, or specialized licensing cannot be moved to serverless purely for simplicity. Conversely, event-driven workloads with unpredictable demand gain little from EC2 fleet management.
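The ordering in that principle can be sketched in a few lines of Python. This is an illustrative sketch only, not an AWS API: the `Workload` fields, the `choose_execution_model` function, and the fallback to containers when nothing else applies are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical workload description; field names are illustrative."""
    needs_kernel_tuning: bool = False
    needs_dedicated_hardware: bool = False
    needs_specialized_licensing: bool = False
    event_driven: bool = False
    unpredictable_demand: bool = False

    def has_hard_constraints(self) -> bool:
        # Requirements that rule out serverless regardless of convenience.
        return (
            self.needs_kernel_tuning
            or self.needs_dedicated_hardware
            or self.needs_specialized_licensing
        )


def choose_execution_model(workload: Workload) -> str:
    # Step 1: validate constraints. Hard requirements are checked first
    # and override any preference for a more managed option.
    if workload.has_hard_constraints():
        return "ec2"
    # Step 2: only then optimize for operational overhead.
    if workload.event_driven or workload.unpredictable_demand:
        return "serverless"
    # Assumed default for this sketch; a real decision needs more inputs.
    return "containers"
```

For example, a workload that is event-driven but also needs kernel tuning still resolves to `"ec2"`, because the constraint check runs before the optimization step.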

Matching execution models to workloads

Selecting the right execution model requires aligning workload characteristics with the capabilities and constraints of each compute option. Three primary models cover the decision space.

EC2 with Auto Scaling groups

EC2 is the correct choice when workloads require any of the following:

- Stateful processing with persistent local storage
- OS-level customization such as custom kernel modules or system libraries
- GPU or FPGA acceleration for machine learning training or genomics
- Licensing models tied to physical hosts or sockets through Dedicated Hosts (EC2 instances running on physical servers exclusively allocated to a single AWS account, enabling BYOL licensing that requires per-socket or per-core visibility)
- Long-running processes exceeding Lambda's 15-minute limit
- Tightly coupled network communication, such as MPI-based HPC, that requires deterministic latency
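The criteria above can also be written as a small checklist that reports which of them a workload triggers. This is a hedged sketch: the `ec2_reasons` function and its parameter names are hypothetical, and the only figure taken from the text is Lambda's 15-minute limit.

```python
# Lambda's maximum execution time, as stated in the text (15 minutes).
LAMBDA_MAX_TIMEOUT_SECONDS = 15 * 60


def ec2_reasons(
    *,
    persistent_local_state: bool = False,
    custom_kernel_modules: bool = False,
    gpu_or_fpga: bool = False,
    host_bound_license: bool = False,
    max_runtime_seconds: int = 0,
    tightly_coupled_mpi: bool = False,
) -> list[str]:
    """Return the EC2 criteria a workload triggers (empty list if none)."""
    reasons = []
    if persistent_local_state:
        reasons.append("stateful processing with persistent local storage")
    if custom_kernel_modules:
        reasons.append("OS-level customization")
    if gpu_or_fpga:
        reasons.append("GPU or FPGA acceleration")
    if host_bound_license:
        reasons.append("host- or socket-bound licensing (Dedicated Hosts)")
    if max_runtime_seconds > LAMBDA_MAX_TIMEOUT_SECONDS:
        reasons.append("runtime exceeds Lambda's 15-minute limit")
    if tightly_coupled_mpi:
        reasons.append("tightly coupled, latency-sensitive networking")
    return reasons
```

An empty result does not mean EC2 is wrong for the workload, only that none of these specific criteria forces it.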

Containers on ECS or EKS

Containers fit portable ...