FM Deployment with SageMaker AI
Understand how to deploy foundation models for generative AI using Amazon SageMaker AI. Learn to choose and configure the right endpoint types—real-time, asynchronous, or serverless—to balance latency, cost, and workload demands in production scenarios. Gain insight into managing large models with predictable performance and scalable inference.
SageMaker AI is typically introduced when generative AI workloads require more control than fully managed or on-demand inference options can provide. This often happens when models are large, customized, or expected to serve production traffic with predictable performance characteristics. In these scenarios, inference behavior depends on both the model’s capabilities and how infrastructure, memory, and execution time are managed.
Generative AI systems tend to surface these needs quickly. Large language models have long initialization times, high memory footprints, and token-based execution patterns that make cold starts and opaque scaling behavior unacceptable. When an application requires consistent latency, sustained throughput, or the ability to process very large requests, SageMaker AI becomes the natural choice for hosting inference, because it exposes the instance types, scaling behavior, and execution limits those workloads depend on.
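To make the three endpoint options concrete, here is a minimal sketch of how each might be expressed as the request shape accepted by boto3's `sagemaker` client (`create_endpoint_config`). The model name, instance type, and S3 bucket below are placeholder assumptions, not values from this course; in practice these dicts would be passed to `create_endpoint_config` followed by `create_endpoint`.

```python
# Sketch: the three SageMaker AI endpoint types as create_endpoint_config
# request shapes. MODEL_NAME, instance types, and the S3 path are placeholders.

MODEL_NAME = "my-llm-model"  # hypothetical model already registered in SageMaker

# 1. Real-time: dedicated instances for consistent latency and sustained throughput.
realtime_config = {
    "EndpointConfigName": "llm-realtime",
    "ProductionVariants": [{
        "VariantName": "primary",
        "ModelName": MODEL_NAME,
        "InstanceType": "ml.g5.2xlarge",   # placeholder GPU instance
        "InitialInstanceCount": 1,
    }],
}

# 2. Asynchronous: requests are queued and results written to S3, which
#    accommodates very large payloads and long-running generations.
async_config = {
    "EndpointConfigName": "llm-async",
    "ProductionVariants": [{
        "VariantName": "primary",
        "ModelName": MODEL_NAME,
        "InstanceType": "ml.g5.2xlarge",
        "InitialInstanceCount": 1,
    }],
    "AsyncInferenceConfig": {
        "OutputConfig": {"S3OutputPath": "s3://my-bucket/async-results/"},  # placeholder bucket
    },
}

# 3. Serverless: scale-to-zero capacity; cost-efficient for spiky traffic,
#    but cold starts make it a poor fit for large, latency-sensitive models.
serverless_config = {
    "EndpointConfigName": "llm-serverless",
    "ProductionVariants": [{
        "VariantName": "primary",
        "ModelName": MODEL_NAME,
        "ServerlessConfig": {"MemorySizeInMB": 6144, "MaxConcurrency": 10},
    }],
}
```

The trade-off the section describes is visible in the shapes themselves: only the real-time and asynchronous configs pin dedicated instances (predictable latency, sustained capacity), while the serverless variant trades that predictability for managed, scale-to-zero concurrency.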