Deployment and Orchestration of ML Workflows
Explore practical deployment and orchestration strategies for Amazon SageMaker ML workflows. Understand endpoint types for different inference needs, implement CI/CD pipelines with blue/green deployments, automate retraining with event triggers, and choose the right tools to minimize latency, cost, and operational complexity in production environments.
Question 36
A company operates a real-time fraud detection model that receives sporadic traffic with long idle periods. Sometimes, hours pass without a single request. When invoked, the model must respond with sub-second latency. The team wants to minimize costs while maintaining acceptable response times.
Which SageMaker endpoint type should the team use?
A. Deploy the model on a SageMaker real-time endpoint with a single ml.m5.large instance to guarantee consistently low latency at all times.
B. Deploy the model on a SageMaker Serverless Inference endpoint, accepting potential cold-start latency in exchange for automatic scale-to-zero during idle periods.
C. Deploy the model on a SageMaker Asynchronous Inference endpoint to queue requests and process them when capacity is available.
D. Deploy the model on a SageMaker Batch Transform job triggered by each incoming request to avoid maintaining any persistent infrastructure.
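For reference, the Serverless Inference deployment described in option B can be expressed in a few boto3 calls. The sketch below is illustrative only: the model name, endpoint names, and capacity settings are hypothetical, and it assumes a SageMaker Model named fraud-model has already been created.

```python
import boto3

sm = boto3.client("sagemaker")

# Hypothetical names; "fraud-model" must already exist as a SageMaker Model.
sm.create_endpoint_config(
    EndpointConfigName="fraud-serverless-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "fraud-model",
        # ServerlessConfig replaces InstanceType/InitialInstanceCount:
        # the endpoint scales to zero when idle and bills per invocation.
        "ServerlessConfig": {
            "MemorySizeInMB": 2048,  # valid values: 1024-6144, in 1 GB steps
            "MaxConcurrency": 20,    # concurrent invocations before throttling
        },
    }],
)

sm.create_endpoint(
    EndpointName="fraud-serverless",
    EndpointConfigName="fraud-serverless-config",
)
```

If cold-start latency turns out to be unacceptable for the fraud use case, serverless endpoints also support provisioned concurrency, which keeps a set amount of capacity warm at additional cost.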
Question 37
An ML team needs to deploy a computer vision model that processes high-resolution medical images. Each image can be up to 500 MB in size, and processing takes approximately three minutes per image. Clinicians submit individual images on demand throughout the day, and results do not need to be returned synchronously; they can be retrieved from storage once processing completes.
Which SageMaker inference option should the team use?
A. Deploy the model on a SageMaker real-time endpoint and increase the endpoint timeout to accommodate the three-minute processing time.
B. Deploy the model using SageMaker Batch Transform, triggered each time a new image is uploaded to Amazon S3.
C. Deploy the model on a SageMaker Asynchronous Inference endpoint, which supports payloads up to 1 GB and processing times up to 15 minutes.
D. Deploy the model on a SageMaker Serverless Inference endpoint to minimize costs during periods of low image submission volume.
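To make option C concrete, an Asynchronous Inference endpoint is configured through AsyncInferenceConfig and invoked with an S3 reference instead of an inline payload. The names, instance type, and bucket paths below are hypothetical, and the sketch assumes a SageMaker Model named imaging-model already exists.

```python
import boto3

sm = boto3.client("sagemaker")
runtime = boto3.client("sagemaker-runtime")

# Hypothetical names and buckets; "imaging-model" must already exist.
sm.create_endpoint_config(
    EndpointConfigName="imaging-async-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "imaging-model",
        "InstanceType": "ml.g4dn.xlarge",
        "InitialInstanceCount": 1,
    }],
    AsyncInferenceConfig={
        # Results land in S3; clinicians or a downstream job fetch them later.
        "OutputConfig": {"S3OutputPath": "s3://example-bucket/async-results/"},
        "ClientConfig": {"MaxConcurrentInvocationsPerInstance": 2},
    },
)
sm.create_endpoint(EndpointName="imaging-async",
                   EndpointConfigName="imaging-async-config")

# Invocation references the image in S3 rather than sending bytes inline,
# which is how async inference handles payloads far beyond real-time limits.
resp = runtime.invoke_endpoint_async(
    EndpointName="imaging-async",
    InputLocation="s3://example-bucket/uploads/scan-001.dcm",
    ContentType="application/dicom",
)
print(resp["OutputLocation"])  # where the result will appear when done
```

Because invoke_endpoint_async returns immediately with an OutputLocation, the caller can poll S3 for the finished result, or the endpoint's OutputConfig can include a NotificationConfig so completions and errors publish to SNS topics.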
Question 38
A retail company wants to deploy 50 different product recommendation models, one for each product category. Each model is relatively small (under 200 MB) and uses the same ML framework. Individual product categories receive low traffic, but the company wants all models available for inference at any time. The team needs to minimize endpoint infrastructure costs.
Which deployment approach should the team use?
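Although the answer choices for this question are cut off above, the scenario (many small models sharing one framework, each with low traffic) is the textbook fit for a SageMaker multi-model endpoint, where a single container serves many model artifacts from a shared S3 prefix. The sketch below is hypothetical throughout: the account ID, image URI, role ARN, endpoint names, and per-category .tar.gz layout are all placeholders.

```python
import boto3

sm = boto3.client("sagemaker")
runtime = boto3.client("sagemaker-runtime")

# Hypothetical names; the container image must support multi-model mode,
# and each category model lives as a .tar.gz under the shared S3 prefix.
sm.create_model(
    ModelName="recommender-mme",
    ExecutionRoleArn="arn:aws:iam::123456789012:role/SageMakerRole",
    PrimaryContainer={
        "Image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/recommender:latest",
        "Mode": "MultiModel",  # one endpoint hosts many model artifacts
        "ModelDataUrl": "s3://example-bucket/recommenders/",  # shared prefix
    },
)

# ...create_endpoint_config / create_endpoint as usual, then route each
# request to a specific category model with TargetModel:
resp = runtime.invoke_endpoint(
    EndpointName="recommender-mme",
    TargetModel="electronics.tar.gz",  # loaded on demand, then cached
    ContentType="application/json",
    Body=b'{"user_id": "u-42"}',
)
```

Since each model is under 200 MB, dozens of them fit in memory on a single instance, and SageMaker loads and evicts artifacts on demand, which is what keeps the per-model infrastructure cost low.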