Trusted answers to developer questions
What is the cloud service availability metric?

Furqan Athar

On some days, we notice that a network or online service doesn’t perform at its best, or that it suffers outages. This is because no provider delivers 100% uptime. To measure the availability of a cloud service, providers use a metric commonly expressed in “nines.”

Formula

Availability (%) = ((Guaranteed time − Downtime) / Guaranteed time) × 100

Explanation

  • Guaranteed time: This is the time during which the service provider guarantees the service will be available. If the service level agreement (SLA) specifies that the service will be available from 6 AM to 6 PM, then the guaranteed/agreed time is 12 hours per day.

  • Downtime: This is the time when the service is unavailable during the guaranteed/agreed service time.

  • Availability: This is the percentage of time the service is available.
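
Using the three definitions above, availability can be computed directly. The following is a minimal sketch in Python; the function name `availability_percent` and the choice of hours as the unit are assumptions made for illustration, not part of any provider's tooling.

```python
def availability_percent(guaranteed_hours: float, downtime_hours: float) -> float:
    """Return availability as a percentage of the guaranteed service time."""
    if guaranteed_hours <= 0:
        raise ValueError("guaranteed_hours must be positive")
    if not 0 <= downtime_hours <= guaranteed_hours:
        raise ValueError("downtime_hours must be between 0 and guaranteed_hours")
    # Time the service was actually up, divided by the time it was promised.
    return (guaranteed_hours - downtime_hours) / guaranteed_hours * 100


# Hypothetical example: a 12-hour daily window (6 AM to 6 PM) over 30 days,
# with 3 hours of downtime inside that window.
print(round(availability_percent(12 * 30, 3), 2))
```

Note that only downtime inside the guaranteed window is counted, which is why the denominator is 360 hours here rather than the full 720 hours in 30 days.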

The availability rate matters a lot in cloud computing services. A lower-tier service may offer an uptime of “two nines” (99%), while a higher-tier service may offer “four nines” (99.99%). For the higher-tier service, this means the provider commits to the service being up 99.99% of the time, leaving up to 0.01% of the time as permitted downtime. The goal for many cloud providers is “five nines” (99.999%), and even that still allows some downtime.

Availability nines and the downtimes

The following table shows how the downtime in a year changes as the service availability changes.

The service level agreement calculations assume a continuous uptime of 24/7 all year.

Availability and downtime

AVAILABILITY (%)    DOWNTIME IN A YEAR
99.9                8h 45m 57s
99.99               52m 35.7s
99.999              5m 15.6s
99.9999             31.6s
99.99999            3.2s
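
The downtime figures in the table can be reproduced with a few lines of Python. This is a sketch under one stated assumption: a year is taken as 365.25 days, which matches the table's values. The helper names are hypothetical. Because the sketch rounds to the nearest tenth of a second, the last digit may differ slightly from a table that truncates.

```python
# Assumption: an average year of 365.25 days, running 24/7.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def yearly_downtime_seconds(availability_pct: float) -> float:
    """Return the downtime per year, in seconds, allowed at this availability."""
    return (1 - availability_pct / 100) * SECONDS_PER_YEAR


def format_duration(seconds: float) -> str:
    """Format seconds as 'Hh Mm S.Ss', rounded to a tenth of a second."""
    tenths = round(seconds * 10)
    hours, tenths = divmod(tenths, 36000)   # 36,000 tenths in an hour
    minutes, tenths = divmod(tenths, 600)   # 600 tenths in a minute
    return f"{hours}h {minutes}m {tenths / 10:.1f}s"


for pct in (99.9, 99.99, 99.999, 99.9999, 99.99999):
    print(pct, format_duration(yearly_downtime_seconds(pct)))
```

Each additional nine divides the permitted downtime by ten, which is why the figures fall from hours to minutes to seconds.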

Reasons for losing a nine in uptime

A simple, static application can easily achieve an uptime of four nines. However, as the application grows and the database and other components get bigger, the application becomes more complex, and the risk of losing a nine in the availability metric increases.

Effect of losing a nine

Suppose a service provider states an availability of 99.9% in its SLA, but an e-commerce giant using this service faces 1 hour of downtime in a day instead of the expected 1 minute and 26 seconds. The resulting revenue loss can be severe.
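
The arithmetic behind this scenario can be checked with a short sketch. The function names are hypothetical, and the calculation assumes a 24-hour service day.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def allowed_downtime_per_day(availability_pct: float) -> float:
    """Return the seconds of downtime per day permitted at this availability."""
    return (1 - availability_pct / 100) * SECONDS_PER_DAY


def observed_availability(downtime_seconds: float) -> float:
    """Return the availability (%) for a day with the given downtime."""
    return (SECONDS_PER_DAY - downtime_seconds) / SECONDS_PER_DAY * 100


# 99.9% allows about 86.4 seconds per day (roughly 1 minute 26 seconds).
print(round(allowed_downtime_per_day(99.9), 1))

# One hour of downtime in a day corresponds to roughly 95.83% availability.
print(round(observed_availability(60 * 60), 2))
```

In other words, an hour of downtime in a single day drops that day's availability below even two nines (99%).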

Large cloud providers such as Google, Amazon, and Microsoft have SLAs of at least three nines (99.9%) and at most four nines (99.99%) availability. So, when someone refers to a service’s high availability, they usually mean availability in this range.

Improving availability measures

To meet customers’ expectations, a service provider should carefully craft its SLA.

Communicate with the customer

Before providing service to a customer, a service provider should communicate with the customer about their requirements, peak traffic hours, and other things that would be affected by downtime.

Measure availability over the agreed timeframe

A service provider should measure the availability of its service over the timeframe agreed with the customer. A high availability rate during the customer’s off-peak hours does not benefit them if outages occur during their peak hours. Thus, a service provider should analyze when its service achieves high uptime and communicate this clearly to its customers.
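
One way to measure availability over an agreed timeframe is to count only the portion of each outage that falls inside the agreed window. The sketch below assumes a daily 6 AM to 6 PM window (the example used earlier), outages given as hypothetical `(start, end)` datetime pairs, and outages that do not overlap one another.

```python
from datetime import date, datetime, time, timedelta


def downtime_in_window(outages, day, window_start=time(6, 0), window_end=time(18, 0)):
    """Sum the part of each outage that overlaps the agreed window on `day`."""
    start = datetime.combine(day, window_start)
    end = datetime.combine(day, window_end)
    total = timedelta()
    for outage_start, outage_end in outages:
        # Clip the outage to the agreed window before counting it.
        overlap_start = max(outage_start, start)
        overlap_end = min(outage_end, end)
        if overlap_end > overlap_start:
            total += overlap_end - overlap_start
    return total


def window_availability(outages, day, window_start=time(6, 0), window_end=time(18, 0)):
    """Return availability (%) measured only over the agreed window."""
    window = datetime.combine(day, window_end) - datetime.combine(day, window_start)
    down = downtime_in_window(outages, day, window_start, window_end)
    return (window - down) / window * 100


# Hypothetical outages: one entirely off-peak, one straddling the window's end.
day = date(2022, 1, 10)
outages = [
    (datetime(2022, 1, 10, 2, 0), datetime(2022, 1, 10, 3, 0)),      # outside window
    (datetime(2022, 1, 10, 17, 30), datetime(2022, 1, 10, 18, 30)),  # 30 min inside
]
print(downtime_in_window(outages, day))
print(round(window_availability(outages, day), 2))
```

Here the 2 AM outage does not count against the agreed availability, while only the first 30 minutes of the evening outage do.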

Copyright ©2022 Educative, Inc. All rights reserved