Nicholas Alvarez

System Design

😌 Rate Limiting Made Easy: 5 Common Algorithms

Token Bucket

Nicholas Alvarez
April 1, 2026

Preface

To build an early foundation, it helps to understand the 5 algorithms most commonly used for rate limiting.

If you are not familiar with what a rate limiter is, check out this "Easy" article I previously wrote: Rate Limiters Made Easy

This is part of my series on learning how to pass system design interviews.

Here are the 5 Most Commonly Seen Rate Limiting Algorithms in Real Production Environments

  1. Token Bucket
  2. Leaking Bucket
  3. Fixed Window Counter
  4. Sliding Window Log
  5. Sliding Window Counter

Token Bucket

Think of a bucket that holds up to 4 coins. Above the bucket is a conveyor belt that delivers replacement coins into the bucket at a fixed rate. In this example: 4 coins inside the bucket, 2 replacement coins arriving from above.

This 3D infographic illustrates the Token Bucket algorithm, showing tokens (gold coins) refilling a bucket at a set rate to manage incoming data requests. It visualizes how requests are either allowed by consuming a token or denied/throttled if the bucket is empty.

If the bucket is full, these replacement coins above the bucket "overflow" and fall off to the side. If the bucket is partially empty, these coins replace any missing coins.
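To make the overflow rule concrete, here is a tiny sketch of a single refill step. The function name `refill` and its parameters are my own, chosen for illustration; the numbers match the 4-coin bucket and 2 replacement coins from the example.

```python
def refill(coins: int, incoming: int, capacity: int) -> int:
    """Return the coin count after `incoming` replacement coins arrive."""
    # Coins that do not fit in the bucket overflow and are discarded.
    return min(capacity, coins + incoming)


print(refill(4, 2, 4))  # full bucket: both coins overflow -> 4
print(refill(1, 2, 4))  # 3 coins missing: both coins fit -> 3
print(refill(3, 2, 4))  # 1 coin missing: one fits, one overflows -> 4
```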

Every time a request is made, one coin is removed from the bucket. For example, when a person sends a message via chat, each message costs one coin. When you send a message, the algorithm checks whether there is at least one coin in the bucket. If there is, the message goes through and a coin is removed. If you spam too many messages too quickly, the bucket runs out of coins, and further requests are rejected until the bucket refills.

There are two parameters for this algorithm:

  • Bucket size (4 coins)
  • Refill rate (2 coins replaced every second)
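Here is a minimal sketch of how those two parameters could drive a token bucket in Python. This is one possible single-threaded implementation, not a production-ready one. The class name `TokenBucket`, the `allow` method, and the injectable `clock` argument are my own choices for illustration. Instead of dropping in 2 coins once per second, this version tops up the bucket continuously based on elapsed time, which is a common simplification.

```python
import time
from typing import Callable


class TokenBucket:
    """Minimal single-threaded token bucket (illustrative sketch)."""

    def __init__(self, capacity: float, refill_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity        # bucket size, e.g. 4 coins
        self.refill_rate = refill_rate  # coins added per second, e.g. 2
        self.clock = clock              # injectable so examples can fake time
        self.tokens = capacity          # start with a full bucket
        self.last_refill = clock()

    def _refill(self) -> None:
        now = self.clock()
        elapsed = now - self.last_refill
        # Add coins for the elapsed time; anything above capacity overflows.
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_rate)
        self.last_refill = now

    def allow(self, cost: float = 1) -> bool:
        """Consume `cost` coins if available; otherwise reject the request."""
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Fake clock (an assumption for this demo) so the run is deterministic.
fake_time = [0.0]
bucket = TokenBucket(capacity=4, refill_rate=2, clock=lambda: fake_time[0])

burst = [bucket.allow() for _ in range(5)]
print(burst)   # [True, True, True, True, False]

fake_time[0] = 1.0  # one second later, 2 coins have been added back
later = [bucket.allow() for _ in range(3)]
print(later)   # [True, True, False]
```

The first four messages use up the bucket, the fifth is rejected, and after one second exactly two more messages get through. In real code you would leave `clock` at its default and add locking if several threads share a bucket.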

In real production environments, you need to decide how many buckets to use and what each one limits. You could have separate buckets for different API endpoints, such as sending friend requests, liking friends' posts, and creating posts, each with its own daily limit. You could also add buckets that throttle requests by users' IP addresses. Another strategy is a "global" bucket shared by all requests, capping total traffic at, for example, 10,000 requests per second.
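As a rough sketch of the multi-bucket idea, the snippet below keeps one bucket per (user, endpoint) pair plus one shared global bucket, and only lets a request through if both have a coin. The endpoint names and every per-endpoint limit in `LIMITS` are made-up values for illustration; only the 10,000 requests per second global cap comes from the example above. The same pattern would work with an IP address as the key.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Bucket:
    capacity: float
    refill_rate: float  # coins per second
    tokens: float = field(init=False)
    updated: float = field(default_factory=time.monotonic)

    def __post_init__(self) -> None:
        self.tokens = self.capacity  # start with a full bucket

    def refill(self) -> None:
        now = time.monotonic()
        added = (now - self.updated) * self.refill_rate
        self.tokens = min(self.capacity, self.tokens + added)
        self.updated = now


# Hypothetical limits per endpoint: (bucket size, coins per second).
LIMITS = {
    "send_friend_request": (5, 0.1),
    "like_post": (20, 5),
    "create_post": (3, 0.05),
}

# One bucket shared by all traffic: at most 10,000 requests per second.
GLOBAL = Bucket(capacity=10_000, refill_rate=10_000)

# Per-user, per-endpoint buckets, created lazily on first use.
buckets = {}


def allow(user_id: str, endpoint: str) -> bool:
    key = (user_id, endpoint)
    if key not in buckets:
        buckets[key] = Bucket(*LIMITS[endpoint])
    local = buckets[key]
    local.refill()
    GLOBAL.refill()
    # Only spend coins if BOTH buckets can pay for the request.
    if local.tokens < 1 or GLOBAL.tokens < 1:
        return False
    local.tokens -= 1
    GLOBAL.tokens -= 1
    return True


print([allow("alice", "create_post") for _ in range(4)])
print(allow("bob", "create_post"))
```

Run back to back, Alice's first three posts should be allowed and her fourth rejected, while Bob is unaffected because he has his own bucket. Checking both buckets before spending from either avoids charging the global bucket for a request that the per-user bucket rejects.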

Conclusion

In conclusion, a token bucket holds a fixed maximum number of coins and is topped up with replacement coins at a steady rate. If you send too many requests and the bucket runs out of coins, requests fail until the bucket refills.

Token buckets allow for short bursts of activity, are memory efficient, and simple to add to your applications. The only difficult part is fine-tuning the total coins and the replacement coins for each use case.

For the sake of time and proper learning retention, I will discuss the other 4 most common rate limiting algorithms in upcoming blog posts, starting with my next one.

Summary

Thank you for reading my blog post!

To continue learning the fundamentals of System Design, the next important fundamental to learn is understanding...

Make sure to check out the additional blogs here for materials to help you throughout your learning journeys!

Credit: ByteByteGo - Design a Rate Limiter