
Redis Rate Limiting Architectures: Sliding Window Log vs. Token Bucket

December 13, 2025 · 8 min read

Rate limiting is not merely a mechanism to prevent server overload; it is a fundamental component of API security and billing integrity. While fixed window counters are simple to implement, they are vulnerable to boundary attacks (e.g., a burst of traffic in the 59th second of one window followed by another burst in the 0th second of the next, effectively doubling the allowed rate).

For enterprise-grade applications, two patterns dominate: the Token Bucket and the Sliding Window Log. This guide analyzes their implementation in Redis, focusing on atomicity, space complexity, and latency.

1. The Token Bucket Algorithm

The Token Bucket algorithm relies on the analogy of a bucket being refilled with tokens at a constant rate. Each request consumes a token. If the bucket is empty, the request is rejected.

Mechanism

  1. Bucket Size (B): Maximum burst capacity.
  2. Refill Rate (R): Tokens added per second (the refill arithmetic is sketched below).
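
Before wiring this into Redis, the refill arithmetic is easiest to see in plain Python. The following is a minimal, in-process sketch; the class name and parameters are illustrative, and it is neither distributed nor production-ready:

import time

class TokenBucket:
    """Illustrative in-memory token bucket; shows the refill math only."""

    def __init__(self, capacity, refill_rate):
        self.capacity = capacity          # B: maximum burst size
        self.refill_rate = refill_rate    # R: tokens added per second
        self.tokens = capacity            # start with a full bucket
        self.last_refill = time.monotonic()

    def allow(self, requested=1):
        now = time.monotonic()
        elapsed = now - self.last_refill
        # Refill proportionally to elapsed time, capped at the bucket capacity
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= requested:
            self.tokens -= requested
            return True
        return False

# Example: bursts of up to 10 requests, sustained rate of 2 requests/second
bucket = TokenBucket(capacity=10, refill_rate=2)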

Redis Implementation

To ensure atomicity without using heavy locking mechanisms, Lua scripting is required. A GET followed by a SET in application logic introduces race conditions in distributed environments.

Lua Script (Atomic):

-- KEYS[1]: The rate limit key (e.g., "lim:user:123")
-- ARGV[1]: Capacity (burst size)
-- ARGV[2]: Refill rate (tokens/sec)
-- ARGV[3]: Current timestamp (seconds)
-- ARGV[4]: Requested tokens (usually 1)

local key = KEYS[1]
local capacity = tonumber(ARGV[1])
local rate = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local requested = tonumber(ARGV[4])

local last_tokens = tonumber(redis.call("hget", key, "tokens"))
local last_refill = tonumber(redis.call("hget", key, "last_refill"))

if last_tokens == nil then
    last_tokens = capacity
    last_refill = now
end

-- Calculate refill
local delta = math.max(0, now - last_refill)
local filled_tokens = math.min(capacity, last_tokens + (delta * rate))

local allowed = false
if filled_tokens >= requested then
    filled_tokens = filled_tokens - requested
    allowed = true
end

redis.call("hset", key, "tokens", filled_tokens, "last_refill", now)
redis.call("expire", key, math.ceil(capacity/rate)) -- Cleanup

return allowed
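
Invoking the script from application code is straightforward with redis-py's register_script, which caches the script and calls it via EVALSHA, falling back to EVAL when needed. A minimal sketch, assuming TOKEN_BUCKET_LUA holds the Lua source above and allow_request is our own helper name:

import time
import redis

r = redis.Redis(host='localhost', port=6379, db=0)

# TOKEN_BUCKET_LUA is assumed to contain the Lua script shown above
token_bucket = r.register_script(TOKEN_BUCKET_LUA)

def allow_request(user_id, capacity, refill_rate, tokens=1):
    allowed = token_bucket(
        keys=[f"lim:user:{user_id}"],
        args=[capacity, refill_rate, time.time(), tokens],
    )
    return allowed == 1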

Pros & Cons

  • Pros: O(1) space complexity. Memory usage is constant regardless of traffic volume.
  • Cons: Lower precision for strict rolling windows. It allows bursts, which is desirable for infrastructure protection but problematic for strict API billing.

2. The Sliding Window Log

For scenarios requiring strict precision (e.g., "exactly 100 requests in the last rolling 60 seconds"), the Sliding Window Log is superior. It tracks the timestamp of every request.

Mechanism

  1. Use a Redis Sorted Set (ZSET).
  2. Score = Request timestamp (Unix epoch, ideally with sub-second precision).
  3. Member = Unique request ID (UUID), or the timestamp itself if collisions are handled.

Redis Implementation (Pipeline)

We use a Redis transaction (MULTI/EXEC) to execute the commands atomically; in redis-py, a pipeline is transactional by default.

Python Example:

import time
import redis

r = redis.Redis(host='localhost', port=6379, db=0)

def is_rate_limited(user_id, limit, window_seconds):
    key = f"rate_limit:{user_id}"
    now = time.time()
    window_start = now - window_seconds

    # r.pipeline() is transactional by default, so commands are wrapped in MULTI/EXEC
    pipeline = r.pipeline()
    
    # 1. Remove requests older than the window
    pipeline.zremrangebyscore(key, 0, window_start)
    
    # 2. Add current request (member = timestamp; use a UUID if collisions are possible)
    pipeline.zadd(key, {str(now): now})
    
    # 3. Count requests in current window
    pipeline.zcard(key)
    
    # 4. Set expiry to auto-clean unused keys
    pipeline.expire(key, window_seconds + 1)
    
    results = pipeline.execute()
    
    # results[2] is the ZCARD result (it already includes the request added above)
    request_count = results[2]

    return request_count > limit
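
Usage is a single call per incoming request; the user ID and limits below are illustrative:

# Allow at most 100 requests per rolling 60-second window per user
if is_rate_limited("user:123", limit=100, window_seconds=60):
    # Over the limit: reject, e.g., with HTTP 429 Too Many Requests
    print("Rejected")
else:
    print("Allowed")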

Pros & Cons

  • Pros: 100% precision. Handles the "rolling window" requirement perfectly.
  • Cons: O(N) space per key, where N is the number of timestamps retained in the window (at least the rate limit). High memory footprint for high-throughput APIs. ZREMRANGEBYSCORE can become expensive on large sets.

Comparative Analysis

| Feature | Token Bucket | Sliding Window Log |
| :--- | :--- | :--- |
| Memory Footprint | Low (constant) | High (linear in the limit) |
| Precision | Good (allows bursts) | Perfect (strict window) |
| Redis Commands | Lua script | ZSET pipeline |
| Best Use Case | DDoS protection, public API throttling | Billing logic, strict SLA enforcement |

FAQ

Q: Why not use the Fixed Window Counter? A: Fixed windows suffer from the boundary problem. A user can exhaust their limit at the end of minute 1 and immediately exhaust it again at the start of minute 2, resulting in up to 2x the intended throughput in a short span.

Q: How does Redis Cluster affect these patterns? A: Both methods operate on a single key, so each call naturally targets one hash slot. If you later extend the logic to touch multiple keys in one atomic operation, use hash tags (e.g., {user:123}:limit) so that all keys involved reside on the same hash slot.
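
A minimal illustration of hash-tagged key naming, using hypothetical per-minute and per-day limit keys for the same user:

# Both keys share the hash tag "{user:123}", so Redis Cluster maps them
# to the same slot and a single Lua script or MULTI/EXEC may touch both.
minute_key = "{user:123}:limit:minute"
day_key = "{user:123}:limit:day"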

Q: Which should I choose for a public API? A: Start with Token Bucket. It is memory efficient and forgiving of bursts, which improves user experience. Use Sliding Window only if your business model strictly mandates it.

Secure Your Perimeter with IPASIS

Rate limiting manages the volume of traffic, but it does not filter the intent. Sophisticated attackers use rotating residential proxies to bypass IP-based rate limits by distributing requests across thousands of IPs.

IPASIS provides real-time IP intelligence to detect VPNs, proxies, and compromised nodes before they even hit your rate limiter.

Get your API Key and harden your infrastructure today.
