
SageMaker Core Primitives

Explore the fundamental SageMaker core primitives that power scalable machine learning workflows. This lesson helps you understand how SageMaker handles training, processing, and hosting jobs through containerized, ephemeral compute. Gain insight into choosing the appropriate execution patterns and inference hosting options based on workload requirements. Develop a clear mental model of SageMaker’s container infrastructure, job types, and advanced hosting architectures to prepare for practical ML deployment and the AWS Certified Machine Learning Engineer exam.

For the AWS Certified Machine Learning Engineer – Associate exam, understanding how SageMaker orchestrates training, preprocessing, and inference as discrete, containerized jobs on ephemeral compute is foundational. Confusing these execution patterns is a common source of incorrect answers on the exam, so building a precise mental model of each primitive and its configuration surface is essential.

Training and processing tasks in SageMaker run inside Docker containers on compute instances that SageMaker provisions and terminates automatically. Engineers specify the container image, instance type, and input and output channels, and SageMaker handles the rest. This design means that managing clusters or patching operating systems is not your responsibility.
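The provision-run-terminate lifecycle can be made concrete with a small simulation. The sketch below is pure illustration, not a SageMaker API: the function and stage names are invented here to mirror the steps SageMaker manages automatically for a single job.

```python
# Illustrative simulation of the ephemeral training-job lifecycle.
# None of these names are SageMaker APIs; they mimic the stages
# SageMaker performs on your behalf for each job.

def run_training_job(channels: dict[str, list[float]]) -> dict:
    """Provision -> ingest -> train -> persist artifact -> tear down."""
    # 1. Provision: SageMaker spins up the requested instances.
    instance = {"status": "InService"}

    # 2. Ingest: each input channel (e.g. "train") is downloaded
    #    or streamed from S3/EFS/FSx onto the instance.
    data = channels["train"]

    # 3. Train: the container's entry point runs; here, a trivial
    #    "model" that just learns the mean of the data.
    model_artifact = {"mean": sum(data) / len(data)}

    # 4. Persist: artifacts are written back to the S3 output path
    #    (a placeholder bucket name is used here).
    output = {"s3://my-bucket/output/model.tar.gz": model_artifact}

    # 5. Tear down: the instance is terminated automatically, so
    #    you pay only for the job's runtime.
    instance["status"] = "Terminated"

    return {"artifacts": output, "instance": instance}

result = run_training_job({"train": [1.0, 2.0, 3.0]})
```

Because the compute exists only for the duration of the job, any state worth keeping must flow through the declared output channels; nothing on the instance survives step 5.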

The three core job types form the backbone of SageMaker’s execution model.

  • Training jobs execute model training by pulling data from S3, running training code in a container, and writing model artifacts back to S3.

  • Processing jobs handle data preprocessing, feature engineering, and model evaluation as standalone, containerized tasks.

  • Hosting jobs serve trained models for inference through persistent endpoints, batch operations, or queue-based architectures.

Each job type accepts distinct configuration parameters, including instance types, S3 input and output channels, and container image URIs stored in Amazon ECR. Selecting the correct job type and hosting pattern based on workload characteristics is a frequently tested exam skill.
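To make that configuration surface tangible for training jobs, the sketch below mirrors the shape of the boto3 `create_training_job` request. The image URI, role ARN, job name, and bucket names are placeholder values, not real resources.

```python
# Sketch of a CreateTrainingJob request (boto3-style). All ARNs,
# bucket names, and the image URI are placeholder values.
training_job_request = {
    "TrainingJobName": "demo-xgboost-job",
    "AlgorithmSpecification": {
        # Container image pulled from Amazon ECR
        "TrainingImage": "123456789012.dkr.ecr.us-east-1.amazonaws.com/xgboost:latest",
        "TrainingInputMode": "File",  # or "Pipe" / "FastFile"
    },
    "RoleArn": "arn:aws:iam::123456789012:role/SageMakerExecutionRole",
    "InputDataConfig": [
        {
            # Named input channel backed by an S3 prefix
            "ChannelName": "train",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": "s3://my-bucket/train/",
                }
            },
        }
    ],
    "OutputDataConfig": {"S3OutputPath": "s3://my-bucket/output/"},
    "ResourceConfig": {
        "InstanceType": "ml.m5.xlarge",
        "InstanceCount": 1,
        "VolumeSizeInGB": 30,
    },
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
}

# With credentials configured, this would be submitted as:
#   boto3.client("sagemaker").create_training_job(**training_job_request)
```

Note how the three configurable elements from the paragraph above each appear as a top-level field: the ECR image under `AlgorithmSpecification`, the instance type under `ResourceConfig`, and the S3 channels under `InputDataConfig` and `OutputDataConfig`.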

Training and processing jobs

Training and processing jobs form the core execution mechanisms in SageMaker, enabling scalable model training and data transformation workflows on managed infrastructure.
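A processing job exposes a closely related but distinct configuration surface. The sketch below mirrors the shape of the boto3 `create_processing_job` request; the image URI, role ARN, script path, and bucket names are placeholders chosen for illustration.

```python
# Sketch of a CreateProcessingJob request (boto3-style). The image
# URI, role ARN, script path, and bucket names are placeholders.
processing_job_request = {
    "ProcessingJobName": "demo-preprocess-job",
    "RoleArn": "arn:aws:iam::123456789012:role/SageMakerExecutionRole",
    "AppSpecification": {
        # Processing container image from Amazon ECR, plus the
        # command to run inside the container.
        "ImageUri": "123456789012.dkr.ecr.us-east-1.amazonaws.com/sklearn-processing:latest",
        "ContainerEntrypoint": ["python3", "/opt/ml/processing/code/preprocess.py"],
    },
    "ProcessingResources": {
        "ClusterConfig": {
            "InstanceCount": 1,
            "InstanceType": "ml.m5.xlarge",
            "VolumeSizeInGB": 30,
        }
    },
    "ProcessingInputs": [
        {
            # S3 data copied to a local path inside the container
            "InputName": "raw-data",
            "S3Input": {
                "S3Uri": "s3://my-bucket/raw/",
                "LocalPath": "/opt/ml/processing/input",
                "S3DataType": "S3Prefix",
                "S3InputMode": "File",
            },
        }
    ],
    "ProcessingOutputConfig": {
        "Outputs": [
            {
                # Local results uploaded back to S3 when the job ends
                "OutputName": "features",
                "S3Output": {
                    "S3Uri": "s3://my-bucket/features/",
                    "LocalPath": "/opt/ml/processing/output",
                    "S3UploadMode": "EndOfJob",
                },
            }
        ]
    },
}
```

The key structural difference from training is that a processing job declares local container paths for each input and output: the script reads and writes ordinary files, and SageMaker handles the S3 transfers on either side.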

How are training jobs executed?

A SageMaker training job follows a structured execution sequence. SageMaker first provisions compute resources, then consumes input data through defined channels that can reference Amazon S3, Amazon EFS, or Amazon FSx for Lustre. Depending on the configured input mode, data is either downloaded to local storage or streamed ...