ML Solution Monitoring, Maintenance, and Security
Explore techniques to proactively monitor and maintain machine learning models deployed on Amazon SageMaker. Understand how to detect data distribution drift, manage security and access controls, meet audit and compliance requirements, and optimize operational costs for real-time ML endpoints.
Question 50
A company has deployed a real-time fraud detection model on a SageMaker endpoint. After several weeks in production, the data science team notices that the distribution of categorical feature values in incoming requests has shifted significantly compared to the training data. However, model accuracy metrics have not yet degraded. The team wants to proactively detect and alert on this kind of input data distribution shift before it impacts predictions.
Which approach should the team implement to meet this requirement?
A. Configure a SageMaker Clarify bias monitoring schedule on the endpoint to detect changes in categorical feature distributions and set up CloudWatch alarms for threshold breaches.
B. Configure a SageMaker Model Monitor data quality monitoring schedule with baseline constraints and statistics generated from the training data, and integrate it with CloudWatch alarms to alert when feature distribution drift exceeds defined thresholds.
C. Set up an AWS Lambda function that periodically invokes the endpoint with a held-out test dataset, compares the output predictions against expected values, and publishes custom metrics to CloudWatch.
D. Wait until model accuracy metrics such as precision and recall degrade below acceptable thresholds, then trigger a retraining pipeline using Amazon EventBridge.
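For reference, the monitoring setup described in option B can be wired up with the SageMaker Python SDK. The sketch below is minimal and assumes data capture is already enabled on the endpoint; the S3 paths, role ARN, endpoint name, and schedule name are illustrative placeholders.

```python
from sagemaker.model_monitor import DefaultModelMonitor, CronExpressionGenerator
from sagemaker.model_monitor.dataset_format import DatasetFormat

monitor = DefaultModelMonitor(
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",  # placeholder role ARN
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

# Step 1: compute baseline statistics and constraints from the training data.
monitor.suggest_baseline(
    baseline_dataset="s3://example-bucket/fraud/train.csv",  # placeholder path
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri="s3://example-bucket/fraud/baseline",
)

# Step 2: schedule recurring data-quality checks that compare captured
# endpoint traffic against the baseline constraints and statistics.
monitor.create_monitoring_schedule(
    monitor_schedule_name="fraud-data-quality-schedule",
    endpoint_input="fraud-detection-endpoint",  # placeholder endpoint name
    statistics=monitor.baseline_statistics(),
    constraints=monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.hourly(),
)
```

Each monitoring run emits per-feature drift metrics to CloudWatch (under the aws/sagemaker/Endpoints/data-metrics namespace), so standard CloudWatch alarms on those metrics complete the alerting path.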
Question 51
A healthcare company deploys a SageMaker real-time endpoint for patient risk scoring. Regulatory requirements mandate that all API calls to SageMaker, including the identity of the caller and the timestamp of each call, be logged and auditable. Additionally, the operations team must monitor inference latency and receive alerts when the p99 latency exceeds 500 ms.
Which combination of services should the team use to meet these requirements? (Select TWO.)
A. AWS CloudTrail to log and audit all SageMaker API calls, including caller identity and timestamps.
B. AWS X-Ray to trace all endpoint invocations and capture the identity of each caller for audit purposes.
C. Amazon CloudWatch to monitor ModelLatency metrics and configure alarms when p99 latency exceeds the 500 ms threshold.
D. SageMaker Model Monitor to track inference latency metrics and generate alerts when latency thresholds are breached.
E. Amazon QuickSight to build a real-time dashboard that visualizes API call audit logs and latency metrics with alerting capabilities.
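For reference, the latency half of this requirement maps to a CloudWatch alarm on the endpoint's ModelLatency metric. A minimal boto3 sketch follows; the endpoint name, alarm name, and SNS topic ARN are illustrative, and note that SageMaker publishes ModelLatency in microseconds, so 500 ms is expressed as 500,000.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when p99 ModelLatency on the endpoint exceeds 500 ms
# (500,000 microseconds) for three consecutive 1-minute periods.
cloudwatch.put_metric_alarm(
    AlarmName="patient-risk-endpoint-p99-latency",  # placeholder alarm name
    Namespace="AWS/SageMaker",
    MetricName="ModelLatency",
    Dimensions=[
        {"Name": "EndpointName", "Value": "patient-risk-endpoint"},  # placeholder
        {"Name": "VariantName", "Value": "AllTraffic"},
    ],
    ExtendedStatistic="p99",  # percentile statistics use ExtendedStatistic
    Period=60,
    EvaluationPeriods=3,
    Threshold=500_000,  # microseconds
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-alerts"],  # placeholder
)
```

The audit half (who called which SageMaker API, and when) is the kind of record AWS CloudTrail captures, with caller identity and timestamps included in each event.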
Question 52
An ML team has a SageMaker endpoint running on ml.c5.4xlarge instances for a computer vision classification model built with TensorFlow. They want to reduce the inference cost per prediction while maintaining or improving latency. The model is a standard image classification architecture.
Which approach should the team take to achieve the most cost-effective inference?
A. Switch the endpoint instances from ml.c5.4xlarge to ml.p3.2xlarge GPU instances to increase inference throughput.
B. Compile the TensorFlow model using SageMaker Neo for AWS Inferentia and deploy the optimized model on ml.inf1 instances.
C. Enable Application Auto Scaling on the existing ml.c5.4xlarge endpoint to dynamically adjust instance count based on traffic patterns.
D. Migrate the endpoint to SageMaker Serverless Inference to eliminate idle instance costs and pay only per inference request.
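For reference, the Neo-to-Inferentia path in option B looks roughly like this with the SageMaker Python SDK. This is a minimal sketch: the model artifact location, role ARN, framework version, and input tensor name and shape are assumptions, and supported framework versions for Inferentia compilation should be verified against current Neo documentation.

```python
from sagemaker.tensorflow import TensorFlowModel

role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder role ARN

model = TensorFlowModel(
    model_data="s3://example-bucket/cv-model/model.tar.gz",  # placeholder artifact path
    role=role,
    framework_version="2.8",  # assumed TF version; check Neo support matrix
)

# Compile the model for AWS Inferentia with SageMaker Neo.
compiled_model = model.compile(
    target_instance_family="ml_inf1",
    input_shape={"input_1": [1, 224, 224, 3]},  # assumed input name and shape
    output_path="s3://example-bucket/cv-model/compiled",
    role=role,
    framework="tensorflow",
    framework_version="2.8",
)

# Deploy the compiled model on an Inferentia instance.
predictor = compiled_model.deploy(
    initial_instance_count=1,
    instance_type="ml.inf1.xlarge",
)
```

AWS positions inf1 instances as delivering a lower cost per inference than comparable CPU or GPU instances for supported architectures, which is the premise behind this option.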