dev.to, March 21, 2026

Rate Limiting Your API: Token Bucket, Sliding Window, and Redis

api-rate-limiting, token-bucket, sliding-window, redis, backend-patterns


One abusive client sends 10,000 requests per second. Your database melts. Every other user gets 503s. Rate limiting is not optional.

Three algorithms cover most cases:

Fixed window: Count requests per time window (e.g., 100 per minute). Simple, but it has the boundary problem: 100 requests at 0:59 + 100 at 1:00 = 200 in 2 seconds.

Sliding window log: Store the timestamp of every request. Count entries within the window. Accurate but memory-hungry.

Token bucket: Tokens refill at a steady rate. Each request consumes a token. When the bucket is empty, reject. Allows short bursts while enforcing an average rate.

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  // capacity: maximum burst size; refillRate: tokens added per second
  constructor(private capacity: number, private refillRate: number) {
    this.tokens = capacity;
    this.lastRefill = Date.now();
  }

  consume(): boolean {
    this.refill();
    if (this.tokens < 1) return false;
    this.tokens--;
    return true;
  }

  // Credit tokens for the time elapsed since the last call, capped at capacity
  private refill() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
    this.lastRefill = now;
  }
}
```

In-memory rate limiting fails with multiple servers, because each server only sees its own share of the traffic. Use a Redis sliding window (a sorted set of request timestamps) instead:

```typescript
// Assumes the ioredis client; the pipeline() calls below match its API.
import Redis from "ioredis";

async function slidingWindowRateLimit(
  redis: Redis,
  key: string,
  limit: number,
  windowMs: number
): Promise<{ allowed: boolean; remaining: number }> {
  const now = Date.now();
  const windowStart = now - windowMs;

  const pipeline = redis.pipeline();
  pipeline.zremrangebyscore(key, 0, windowStart); // drop entries older than the window
  pipeline.zadd(key, now, `${now}-${Math.random()}`); // record this request with a unique member
  pipeline.zcard(key); // count requests still inside the window
  pipeline.expire(key, Math.ceil(windowMs / 1000)); // let idle keys expire

  const results = await pipeline.exec();
  if (!results) throw new Error("Redis pipeline returned no results");
  const count = results[2][1] as number;

  // Note: rejected requests are recorded too, so they count against the window.
  return { allowed: count <= limit, remaining: Math.max(0, limit - count) };
}
```

Always return rate limit headers so clients can self-throttle:

```
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 42
X-RateLimit-Reset: 1679529600
Retry-After: 30
```

What to key the limit on:

- Per-IP is simple, but shared IPs get unfairly limited.
- Per-API-key is best for authenticated APIs.
- Per-user works for logged-in applications.
- Combine multiple: global + per-user + per-endpoint.

Common mistakes:

- In-memory only: Fails with multiple servers. Use Redis.
- No headers: Clients cannot self-throttle without rate limit headers.
- Same limit everywhere: Read endpoints need higher limits than write endpoints.
- 429 without Retry-After: Tell clients when they can retry.

Part of my Production Backend Patterns series. Follow for more practical backend engineering.
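To make the fixed-window boundary problem concrete, here is a minimal in-memory sketch. The `FixedWindowLimiter` class and its injectable clock are hypothetical names introduced for illustration; they are not from the article. It replays the 0:59 / 1:00 scenario against a limit of 100 per minute.

```typescript
// Hypothetical fixed-window limiter, for illustration only.
class FixedWindowLimiter {
  private windowId = -1; // index of the window the counter belongs to
  private count = 0;

  constructor(
    private limit: number,
    private windowMs: number,
    private now: () => number = Date.now // injectable clock (an assumption, for testing)
  ) {}

  allow(): boolean {
    const id = Math.floor(this.now() / this.windowMs);
    if (id !== this.windowId) {
      // A new window has started: reset the counter
      this.windowId = id;
      this.count = 0;
    }
    if (this.count >= this.limit) return false;
    this.count++;
    return true;
  }
}

// Replay the boundary case: 100 requests at 0:59, then 100 more at 1:00.
let clock = 59000;
const limiter = new FixedWindowLimiter(100, 60000, () => clock);
let accepted = 0;
for (let i = 0; i < 100; i++) if (limiter.allow()) accepted++;
clock = 60000;
for (let i = 0; i < 100; i++) if (limiter.allow()) accepted++;
console.log(accepted); // 200: every request passes despite the 100/min limit
```

Both bursts land in different windows, so the counter resets between them. A sliding window or token bucket avoids this because neither resets its state at a fixed boundary.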
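The header advice can be sketched as a small framework-independent helper. The `RateLimitState` shape and the `rateLimitHeaders` function are assumptions made for this example, and treating `X-RateLimit-Reset` as Unix seconds follows the sample value in the article rather than any standard; other APIs use different conventions.

```typescript
// Hypothetical shape; field names are assumptions for illustration.
interface RateLimitState {
  allowed: boolean;
  limit: number;
  remaining: number;
  resetAtMs: number; // epoch milliseconds when the client may send again
}

// Builds the response headers; callers should answer with 429 when !state.allowed.
function rateLimitHeaders(
  state: RateLimitState,
  nowMs: number = Date.now()
): Record<string, string> {
  const headers: Record<string, string> = {
    "X-RateLimit-Limit": String(state.limit),
    "X-RateLimit-Remaining": String(Math.max(0, state.remaining)),
    "X-RateLimit-Reset": String(Math.ceil(state.resetAtMs / 1000)),
  };
  if (!state.allowed) {
    // Retry-After is expressed in whole seconds; never advertise less than 1
    const waitSeconds = Math.ceil((state.resetAtMs - nowMs) / 1000);
    headers["Retry-After"] = String(Math.max(1, waitSeconds));
  }
  return headers;
}

// Usage: a rejected request 30 seconds before the reset time
const blocked = rateLimitHeaders(
  { allowed: false, limit: 100, remaining: 0, resetAtMs: 1679529600000 },
  1679529570000
);
console.log(blocked["Retry-After"]); // "30"
```

Keeping this separate from the limiter means the same headers can be attached whether the decision came from the token bucket or from Redis.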