Reliability on the Cloud
Explore the foundational concepts of reliability in cloud architecture. Understand how to design systems that recover from failures automatically, scale on demand, and manage changes with automation. This lesson covers essential strategies to monitor, test, and maintain reliable cloud services, helping you manage disruptions and maintain performance efficiently.
We'll cover the following...
- Design principles: the five design principles for reliability on the cloud
  - Test recovery procedures
  - Automatically recover from failure
  - Scale horizontally to increase aggregate system availability
  - Stop guessing capacity
  - Manage change in automation
- Definition
- Best practices
  - Foundations
  - Change management
  - Failure management
- Key services
  - Foundations
  - Change management
  - Failure management
The reliability pillar includes a system’s ability to recover from ...