Uptime and Availability
Explore how uptime and availability metrics provide insight into the reliability of APIs from a user perspective. Understand the significance of errors per minute and latency in assessing API performance and how these metrics impact customer experience and SLA compliance.
Although customer applications depend on the performance of our APIs, the infrastructure decisions that make those APIs performant are hidden from our customers. Uptime and availability are the most outward-facing of all infrastructure metrics. Most companies publish a status page with uptime and availability stats for their APIs. These pages are typically generated by a ping service that checks the API endpoints at regular intervals.
The availability of a service is the probability that the system is available to the user over a given period of time. Because it is expressed as a likelihood, availability helps us understand the reliability of our APIs at a wider scale and how that reliability might affect the user experience.
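To make the definition concrete, here is a minimal sketch of how a ping-style check could estimate availability as the fraction of successful probes. The names `http_probe`, `run_checks`, and `availability` are hypothetical and not taken from any particular status-page product. Treating any 2xx response as "available" is also an assumption; a real service would likely apply stricter health criteria.

```python
import urllib.error
import urllib.request
from typing import Callable, List


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers with a 2xx status within the timeout.

    Assumption: a 2xx response means "available". Timeouts, connection
    errors, and 4xx/5xx responses all count as "unavailable".
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        return False


def run_checks(probe: Callable[[], bool], count: int) -> List[bool]:
    """Call the probe `count` times and collect the outcomes.

    A real ping service would space these calls out on a schedule;
    the scheduling is omitted to keep the sketch short.
    """
    return [probe() for _ in range(count)]


def availability(results: List[bool]) -> float:
    """Estimate availability as successful checks divided by total checks."""
    if not results:
        raise ValueError("at least one check result is required")
    return sum(results) / len(results)
```

For example, `availability(run_checks(lambda: http_probe("https://api.example.com/health"), 10))` would give an estimate between 0.0 and 1.0 for a placeholder endpoint. The probe is passed in as a callable so the estimate can be exercised without network access.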
Uptime measures the reliability of an API as the percentage of time the service has been working and ready for use. Uptime is the go-to standard for measuring the availability of APIs. In the following screenshot, we can see how Stripe displays the uptime of its APIs over a 90-day period on its status page.
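As a rough illustration of the arithmetic behind an uptime figure, the sketch below turns recorded downtime into an uptime percentage and works out how much downtime a given target allows over a window. The function names and the 99.9% target are illustrative assumptions, not figures from Stripe's status page.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def uptime_percentage(window_seconds: float, downtime_seconds: float) -> float:
    """Percentage of the window during which the service was up."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    if not 0 <= downtime_seconds <= window_seconds:
        raise ValueError("downtime_seconds must lie within the window")
    return 100.0 * (window_seconds - downtime_seconds) / window_seconds


def allowed_downtime_seconds(target_percentage: float, window_seconds: float) -> float:
    """Downtime budget, in seconds, that still meets the uptime target."""
    if not 0 <= target_percentage <= 100:
        raise ValueError("target_percentage must be between 0 and 100")
    return window_seconds * (1 - target_percentage / 100.0)


# Hypothetical example: a 90-day window and a 99.9% uptime target.
window = 90 * SECONDS_PER_DAY
budget = allowed_downtime_seconds(99.9, window)
print(f"Downtime budget: {budget / 60:.1f} minutes")
print(f"Uptime if the budget is fully used: {uptime_percentage(window, budget):.3f}%")
```

Under these assumptions, a 99.9% target over 90 days leaves a budget of roughly 130 minutes of downtime, which shows how little room even a short outage leaves when an SLA is expressed as an uptime percentage.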
In the preceding screenshot, we can see that the uptime stats of Stripe’s APIs also display partial degradations as yellow bars in the middle of mostly green bars representing 100% uptime. When new customers review Stripe APIs as a possible solution, this level of transparency in reporting uptime inspires confidence that the service is reliable. For customers who already use Stripe, ...