Hub‑and‑Spoke MLOps: Multi‑Account Deployment Strategies
Explore how to operationalize machine learning across multiple AWS accounts using the hub-and-spoke model. Understand the roles of the hub and spoke accounts, implement secure cross-account resource sharing with AWS RAM, and enforce governance through IAM and service control policies. Learn to automate model promotion from development to production while maintaining compliance and operational resilience.
Enterprise ML teams inevitably outgrow a single AWS account. What begins as a convenient shared workspace becomes a liability as teams scale. The limitations are concrete and compounding.
Permission sprawl emerges first. When data scientists, ML engineers, QA teams, and SREs share one account, IAM policies become increasingly permissive to avoid blocking workflows. A single overly broad role can grant training job access alongside endpoint deployment privileges, violating the principle of least privilege. Blast radius is the second concern: a misconfigured training job that exhausts compute quotas, or a single compromised credential, affects every workload in the account. Audit complexity is the third: CloudTrail events from dozens of teams interleave in the same account, making it hard to trace a model's journey from experiment to production.
The solution is architectural separation using three core AWS services working in concert:
SageMaker Model Registry: A centralized catalog for versioned model packages with built-in approval workflows, serving as the single source of truth for what is production-ready.
AWS Resource Access Manager (RAM): A cross-account sharing mechanism that grants spoke accounts access to hub resources, such as Model Registry model package groups, without duplicating them in each account. Model artifacts in S3 are not shared through RAM, so they still need narrowly scoped bucket and KMS key policies.
AWS Organizations: The account structure and governance layer, enabling service control policies that enforce boundaries at the organizational level.
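To make the RAM piece concrete, the following is a minimal sketch of the request a hub account might send to share one model package group with its spokes. The account IDs, Region, and group name are hypothetical placeholders; the dictionary keys follow the parameter names of RAM's CreateResourceShare API. The sketch only builds and prints the request, so it runs without AWS credentials.

```python
import json


def build_ram_share_request(hub_account_id, region, group_name, spoke_account_ids):
    """Build keyword arguments for RAM's CreateResourceShare API."""
    # ARN of the model package group owned by the hub account.
    group_arn = (
        f"arn:aws:sagemaker:{region}:{hub_account_id}:"
        f"model-package-group/{group_name}"
    )
    return {
        "name": f"{group_name}-share",
        "resourceArns": [group_arn],
        "principals": list(spoke_account_ids),
        # Keep the share inside the AWS Organization.
        "allowExternalPrincipals": False,
    }


# Hypothetical account IDs and names, for illustration only.
request = build_ram_share_request(
    hub_account_id="111111111111",
    region="us-east-1",
    group_name="fraud-detection",
    spoke_account_ids=["222222222222", "333333333333"],
)
print(json.dumps(request, indent=2))

# With hub-account credentials, the request could be submitted as:
#   import boto3
#   boto3.client("ram").create_resource_share(**request)
```

Setting allowExternalPrincipals to False assumes that every spoke belongs to the same organization as the hub and that RAM sharing within the organization is enabled.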
Hub-and-spoke topology
The hub-and-spoke pattern is a widely used approach to multi-account ML on AWS. A central hub account owns governance: the Model Registry, shared artifact storage, and event routing. Spoke accounts (development, staging, production) operate in isolation, each with a clearly scoped responsibility. Business drivers reinforce this: regulated industries often rely on environment separation to help meet SOC 2 and HIPAA requirements, platform teams need operational resilience, and leadership demands clear ownership boundaries.
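One way to keep each spoke within its scoped responsibility is a service control policy attached to that spoke's account or organizational unit. The sketch below builds such a policy document as plain JSON. The choice of denied actions is an assumption made for illustration (a production spoke that serves models but never trains them), not a rule taken from this lesson; the action names themselves are standard SageMaker IAM actions.

```python
import json


def build_spoke_scp(denied_actions):
    """Return an SCP document that denies the given actions on all resources."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutOfScopeSageMakerActions",
                "Effect": "Deny",
                # Sorted so the generated document is stable across runs.
                "Action": sorted(denied_actions),
                "Resource": "*",
            }
        ],
    }


# Assumption: the production spoke only serves models, so training is denied.
PRODUCTION_DENIED = {
    "sagemaker:CreateTrainingJob",
    "sagemaker:CreateHyperParameterTuningJob",
}

production_scp = build_spoke_scp(PRODUCTION_DENIED)
print(json.dumps(production_scp, indent=2))
```

Because an SCP only limits what IAM policies in the account can grant, the spoke still needs its own IAM roles to allow the actions it is meant to perform.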
This architecture creates a controlled promotion pipeline: models flow from development spokes into the hub registry, pass through approval gates, and become available to staging and production spokes. Models never flow through direct spoke-to-spoke access.
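The promotion rules above can be modeled in a few lines. This is an illustrative sketch, not part of any AWS SDK: the account labels, the ModelVersion class, and can_promote are hypothetical names. The approval status strings mirror the values the Model Registry uses for a model package (PendingManualApproval, Approved, Rejected).

```python
from dataclasses import dataclass

HUB = "hub"

# Directions in which a model may move; every path includes the hub.
ALLOWED_FLOWS = {
    ("development", HUB),  # register a candidate model
    (HUB, "staging"),      # deploy for validation
    (HUB, "production"),   # deploy for serving
}


@dataclass
class ModelVersion:
    name: str
    version: int
    approval_status: str = "PendingManualApproval"


def can_promote(model, source, target):
    """Return True if the model may move from source to target."""
    if (source, target) not in ALLOWED_FLOWS:
        # Covers direct spoke-to-spoke moves such as development -> production.
        return False
    if source == HUB:
        # Approval gate: only approved versions leave the hub.
        return model.approval_status == "Approved"
    return True


candidate = ModelVersion(name="fraud-detection", version=3)
print(can_promote(candidate, "development", HUB))           # registration
print(can_promote(candidate, "development", "production"))  # spoke-to-spoke
print(can_promote(candidate, HUB, "production"))            # not yet approved
```

In a real deployment these checks would be enforced by IAM, RAM, and SCPs rather than application code; the sketch only makes the intended flow explicit.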
The following diagram illustrates how these components connect across account boundaries.
With the topology visualized, ...