Uber API Design Evaluation and Latency Budget

Learn how we meet the non-functional requirements and estimate the response time of the Uber APIs.

Introduction

We've discussed the design considerations and API model for the functional requirements of our Uber service. In this lesson, we'll cover different approaches to achieve our non-functional requirements and estimate the response time of our Uber services.

Non-functional requirements

The non-functional requirements for our APIs are availability, low latency, scalability, and security. Let's understand how we can achieve our requirements.

Availability

We ensure availability by decoupling the services. For example, if the rider service goes down temporarily, the driver service keeps working and continues to track up-to-date driver locations. Availability also depends on the supporting services. If Google Maps is unavailable, we can fall back to alternative map services (such as MapQuest and Waze), although these may not support every feature that Google Maps offers. Likewise, our service supports multiple payment methods: if one payment service is down, riders can use another method, such as Uber balance or cash.

Moreover, we prioritize critical requests, such as those for ongoing trips and payments, and apply rate limiting at the API gateway to all other requests so that user traffic doesn't overload the services. We also run multiple replicas of each service, so that if one instance goes down or becomes overloaded, the others continue to serve requests.
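The prioritization and rate limiting described above can be sketched with a token bucket at the gateway. This is only an illustration under stated assumptions: the path prefixes, the `TokenBucket` and `Gateway` classes, and the choice of a token bucket are all hypothetical and are not taken from Uber's actual gateway.

```python
import time

# Hypothetical path prefixes for prioritized traffic; the lesson only names
# ongoing trips and payments as examples of prioritized requests.
PRIORITY_PATHS = ("/trips/ongoing", "/payments")


class TokenBucket:
    """Token bucket: `rate` tokens are added per second, up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the limiter can be tested
        self.tokens = capacity      # start with a full bucket
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to the elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class Gateway:
    """Admits prioritized requests directly and rate-limits everything else."""

    def __init__(self, bucket):
        self.bucket = bucket

    def admit(self, path):
        # Prioritized requests bypass the limiter (an assumption of this sketch).
        if path.startswith(PRIORITY_PATHS):
            return True
        return self.bucket.allow()
```

In this sketch, a rejected request would typically be answered with an HTTP 429 (Too Many Requests) status, while requests for ongoing trips and payments are never throttled. A real gateway would likely keep one bucket per user or per API key rather than a single shared one.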

Note: Uber may also integrate with local payment gateways, depending on the region it operates in. As a result, an outage of a single payment gateway is unlikely to take the Uber service down.
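Falling back from one payment method to another can be sketched as an ordered list of attempts. The names below (`PaymentUnavailable`, `charge_with_fallback`, and the stub charge functions) are hypothetical placeholders for illustration, not part of any real payment API.

```python
class PaymentUnavailable(Exception):
    """Raised by a payment method that cannot process the charge right now."""


def charge_with_fallback(amount, methods):
    """Try each (name, charge_fn) pair in order.

    Returns the name of the first method that succeeds, or raises
    PaymentUnavailable if every method fails.
    """
    errors = []
    for name, charge in methods:
        try:
            charge(amount)
            return name
        except PaymentUnavailable as exc:
            # Record the failure and move on to the next method.
            errors.append((name, str(exc)))
    raise PaymentUnavailable(f"all payment methods failed: {errors}")


# Stub payment methods standing in for real integrations (assumptions).
def charge_card(amount):
    raise PaymentUnavailable("card gateway is down")


def charge_uber_balance(amount):
    return None  # pretend the charge succeeded


used = charge_with_fallback(
    12.50,
    [("card", charge_card), ("uber_balance", charge_uber_balance)],
)
print(used)  # uber_balance
```

The same ordered-fallback pattern would apply to the map providers mentioned earlier, with the caveat that an alternative provider may not support every feature of the primary one.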

Scalability

Since most of the communication in our design happens through the pub-sub service, we create multiple replicas of it to avoid a single point of failure (SPOF). We can also use different instances of the pub-sub service for unrelated groups of riders and drivers to distribute the load. This also lets us decouple services and improves the scalability of the API. Because requests to the services are stateless, we can replicate the services and forward each request to any available server. However, the scalability of our services also depends on the scalability of the supporting services: if any supporting service has scaling issues, it could become a bottleneck for ours. Services like Google Maps are highly scalable and use CDNs to serve static data specific to a customer's region, which reduces the risk of disruption, so there's little chance that such supporting services will affect our service.
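One way to spread unrelated riders and drivers across pub-sub instances is to pick the instance from a stable hash of a shared key. The sketch below assumes the key is a trip ID and uses made-up instance names; the lesson doesn't say how the mapping is actually done.

```python
import hashlib

# Hypothetical pub-sub instance names; the lesson does not name specific brokers.
PUBSUB_INSTANCES = ["pubsub-0", "pubsub-1", "pubsub-2"]


def instance_for(trip_id, instances=PUBSUB_INSTANCES):
    """Map a trip ID to one pub-sub instance using a stable hash.

    The rider and driver on the same trip use the same trip ID, so they
    always land on the same instance, while unrelated trips are spread
    across all instances.
    """
    digest = hashlib.sha256(trip_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(instances)
    return instances[index]
```

A plain modulo mapping like this one reassigns most keys whenever the number of instances changes. If instances are added or removed often, consistent hashing would be a natural refinement.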
