
Compute Foundations for ML Workloads

Understand how to choose appropriate EC2 instance types for different ML tasks, apply cost-effective pricing strategies like Spot and Savings Plans, and leverage containerization with AWS services. This lesson helps you optimize training and deployment performance while managing costs and operational complexity in AWS ML workflows.

Every ML workflow depends on compute, from training a billion-parameter transformer to serving sub-100 ms predictions for a fraud-detection API. The wrong instance choice can mean 10× higher costs or hours of wasted training time. This lesson covers three compute dimensions that the exam expects you to understand: EC2 instance type selection for ML performance, on-demand vs. provisioned resource strategies for cost control, and containerization with AWS container services for deployment flexibility. Amazon SageMaker abstracts much of this infrastructure, but exam scenarios regularly ask you to optimize training duration, reduce inference latency, or minimize monthly spend. Understanding what runs beneath SageMaker’s managed surface is the difference between a passing score and guesswork.

EC2 instance types for ML workloads

Selecting the right EC2 instance family is a modeling and deployment decision that directly affects training throughput, inference latency, and cost. SageMaker exposes these same families as ml.* instance types (for example, ml.c5.xlarge), so the logic applies whether you launch raw EC2 instances or configure a SageMaker training job.
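The mapping from workload profile to instance family can be sketched as a simple lookup. This is an illustrative helper, not an AWS API; the workload categories and the `pick_instance_family` name are assumptions made for the example, and the families shown follow the guidance in this lesson.

```python
# Illustrative sketch: map an ML workload category to an EC2 family,
# shown with the SageMaker ml.* prefix. Not an AWS API -- the helper
# and category names are hypothetical.

def pick_instance_family(workload: str) -> str:
    """Return a suggested ml.* instance family for a workload category."""
    families = {
        "notebook": "ml.m5",             # general purpose: balanced vCPU/memory
        "feature_engineering": "ml.c5",  # compute optimized: CPU-bound transforms
        "xgboost_training": "ml.c5",     # traditional ML rarely needs a GPU
    }
    if workload not in families:
        raise ValueError(f"Unknown workload category: {workload}")
    return families[workload]

print(pick_instance_family("feature_engineering"))  # -> ml.c5
```

The same family logic applies whether you pass the value as `InstanceType` to a raw EC2 launch or as the instance type of a SageMaker training job.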

Instance families and their ML roles

Each instance family is engineered for a different hardware profile. The key families relevant to ML workloads break down as follows:

  • General purpose (M5/M6i): These instances offer balanced vCPU-to-memory ratios, making them suitable for notebook experimentation, lightweight preprocessing, and prototyping where neither CPU nor memory is the bottleneck.

  • Compute optimized (C5/C6i): Built around high-performance Intel or AMD processors, these instances accelerate CPU-bound tasks: feature-engineering pipelines, traditional ML algorithms like XGBoost or random forests, and numerical transformations that do not benefit from GPU acceleration.

  • Memory optimized ...