Data Security and Governance

Explore how to implement data security and governance for machine learning on AWS. Understand encryption with AWS KMS, centralized access controls using Lake Formation, and automated sensitive data detection with Amazon Macie. Learn to protect ML datasets from unauthorized access while maintaining compliance and operational efficiency.

ML systems routinely ingest large volumes of sensitive data, from personally identifiable information and financial records to protected health data, which is typically centralized in Amazon S3-based data lakes. A single misconfigured bucket policy or an unencrypted training dataset can expose an organization to unauthorized access, costly data breaches, and compliance violations under GDPR (a European Union regulation that governs how organizations collect, process, and protect the personal data of EU residents, emphasizing privacy rights and strict data protection requirements) or HIPAA (a US regulation that establishes standards for protecting sensitive patient health information and ensuring secure handling of protected health data). For the AWS Certified Machine Learning Engineer – Associate exam, it is important to understand three core services that form a layered defense for ML datasets. AWS KMS provides key management for encryption at rest, while encryption in transit is handled through TLS and, for some SageMaker workloads, optional service-level encryption features. AWS Lake Formation provides centralized, fine-grained access control over data lakes registered in the AWS Glue Data Catalog. Amazon Macie automates the discovery of sensitive data in Amazon S3, which you can use as a precheck before datasets are consumed by ML workflows.

Together, these services implement a governance strategy that protects ML datasets across ingestion, storage, and training.
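
To make the risk of a misconfigured bucket policy or an unencrypted dataset concrete, the following is a minimal sketch of a guardrail policy, not a definitive implementation. It builds an S3 bucket policy document that denies requests made without TLS and denies uploads that do not request SSE-KMS. The function name `build_dataset_bucket_policy` and the bucket name `example-ml-training-data` are hypothetical; the condition keys `aws:SecureTransport` and `s3:x-amz-server-side-encryption` are standard IAM and S3 condition keys.

```python
import json


def build_dataset_bucket_policy(bucket_name: str) -> dict:
    """Return a bucket policy that rejects non-TLS requests and non-KMS uploads."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Deny any S3 action that arrives over plain HTTP.
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
            {
                # Deny uploads whose encryption header is not aws:kms.
                "Sid": "DenyUploadsWithoutSseKms",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"{bucket_arn}/*",
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            },
        ],
    }


# Hypothetical bucket name, used only for illustration.
policy = build_dataset_bucket_policy("example-ml-training-data")
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Because the second statement uses a negated string condition, it also matches uploads that omit the encryption header entirely, so clients would need to request SSE-KMS explicitly on each upload. Whether that strictness is appropriate depends on how the bucket's default encryption is configured.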

Encrypting ML datasets with AWS KMS

ML datasets stored in Amazon S3, attached to Amazon EBS volumes, or consumed by SageMaker jobs should be encrypted at rest as a best practice. AWS Key Management Service (KMS), a fully managed service that creates and controls cryptographic keys used to encrypt data across AWS services, integrates ...
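
As one illustration of encryption at rest for a dataset stored in Amazon S3, the sketch below assembles the arguments for an upload that requests server-side encryption with a KMS key. It assumes the boto3 SDK; `ServerSideEncryption` and `SSEKMSKeyId` are parameters of its S3 `put_object` call. The helper names, bucket, object key, and KMS key ARN are hypothetical placeholders, and the upload itself is not executed here because it needs AWS credentials.

```python
def sse_kms_put_kwargs(bucket: str, key: str, body: bytes, kms_key_id: str) -> dict:
    """Build put_object arguments that request SSE-KMS for the uploaded object."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ServerSideEncryption": "aws:kms",  # ask S3 to encrypt with AWS KMS
        "SSEKMSKeyId": kms_key_id,          # the KMS key S3 should use
    }


def upload_encrypted(put_kwargs: dict) -> dict:
    """Send the upload; requires boto3 and configured AWS credentials."""
    import boto3  # imported lazily so the sketch runs without the SDK installed

    return boto3.client("s3").put_object(**put_kwargs)


# All values below are placeholders, not real resources.
put_kwargs = sse_kms_put_kwargs(
    bucket="example-ml-training-data",
    key="train/part-0000.csv",
    body=b"feature_a,feature_b,label\n1.0,2.0,0\n",
    kms_key_id="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
)
print(sorted(put_kwargs))
```

With credentials in place, `upload_encrypted(put_kwargs)` would perform the upload. The caller would also need permission to use the KMS key, which is where key policies act as a second layer of access control on top of S3 permissions.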