
Where to place the rate limiter?

Let’s assume that we have a cluster consisting of several servers. To control the number of requests reaching this cluster, there are two ways to place the rate limiter.

  1. Global rate limiter: A global rate limiter enforces rate limiting for the entire cluster of nodes. In this approach, all incoming requests share a single global counter; for example, with the token bucket algorithm, all requests draw tokens from one shared bucket (a sketch of this shared-bucket idea follows this list). A shared counter is needed because, if each node tracked its own limit independently, a client could exceed the global limit by spreading requests across different nodes, and the more nodes there are, the more likely the client is to exceed the global limit. To enforce the limit with per-node counters instead, we would have to set up sticky sessions in the load balancer so that each consumer is always sent to exactly one node.

  2. Local rate limiter: This design uses a separate rate limiter for each API server. Each rate limiter can then apply different throttling limits to different APIs for each client (identified by an ID or key). Consequently, each API server keeps its own cache of the throttling rules for the APIs it serves.
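To make the shared-counter idea behind the global placement concrete, here is a minimal token bucket sketch in Python. The `TokenBucket` class, the `allow_request` helper, the `capacity` and `refill_rate` values, and the in-memory `shared_buckets` dictionary (standing in for a shared store such as a distributed cache) are all illustrative assumptions, not details from the original design.

```python
import threading
import time


class TokenBucket:
    """Minimal token bucket: tokens refill continuously up to a fixed capacity."""

    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity          # maximum tokens the bucket can hold
        self.refill_rate = refill_rate    # tokens added per second
        self.tokens = float(capacity)
        self.last_refill = time.monotonic()
        self.lock = threading.Lock()

    def allow(self) -> bool:
        """Return True if a request may proceed, consuming one token."""
        with self.lock:
            now = time.monotonic()
            # Refill tokens based on elapsed time, capped at capacity.
            elapsed = now - self.last_refill
            self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
            self.last_refill = now
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False


# Global placement: every node would consult one shared bucket per client,
# so the limit holds across the whole cluster. A process-local dict is used
# here only as a stand-in for that shared store (hypothetical example).
shared_buckets: dict[str, TokenBucket] = {}


def allow_request(client_id: str) -> bool:
    bucket = shared_buckets.setdefault(
        client_id, TokenBucket(capacity=10, refill_rate=5.0)
    )
    return bucket.allow()


if __name__ == "__main__":
    for i in range(12):
        print(i, allow_request("client-42"))
```

Under the local placement, by contrast, each API server would keep its own buckets and its own rule cache in process memory rather than consulting a shared store, which is what allows per-server limits but also what lets a client exceed the cluster-wide total.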
