# Rate Limiting

Comprehensive guide to protecting your APIs with rate limiting in FlexGate.

## What is Rate Limiting?

Rate limiting controls how many requests a client can make to your API within a time window. It protects your backend services and enforces usage policies:

- **Abuse**: Stops malicious actors from overwhelming your API
- **DDoS attacks**: Mitigates distributed denial-of-service traffic
- **Resource exhaustion**: Prevents excess requests from consuming resources
- **Cost control**: Caps usage for free tiers
- **Fair usage**: Ensures all users get fair access
## Quick Start

Enable basic rate limiting on a route:

```yaml
routes:
  - id: api-users
    path: /api/users/*
    upstream: http://backend:8080
    rateLimit:
      enabled: true
      max: 100        # 100 requests
      windowMs: 60000 # per minute
```

This limits clients to 100 requests per minute.
## Rate Limiting Strategies

FlexGate supports four rate limiting algorithms:
### 1. Token Bucket (Recommended)

**How it works:**

- Bucket holds tokens (requests)
- Tokens are replenished at a steady rate
- Each request consumes a token
- Empty bucket = rate limited

**Configuration:**

```yaml
rateLimit:
  enabled: true
  strategy: token-bucket
  max: 100               # Bucket capacity
  windowMs: 60000        # Refill window (1 minute)
  tokensPerInterval: 100 # Tokens added per window
```

**Advantages:**

- ✅ Allows bursts (up to bucket capacity)
- ✅ Smooth over time
- ✅ Fair distribution
- ✅ Best for most use cases

**Visualization:**

```
Bucket (Capacity: 100)
┌─────────────────────┐
│ ████████████░░░░░░░ │ 60 tokens
└─────────────────────┘
          ↓
Request consumes 1 token
          ↓
┌─────────────────────┐
│ ███████████░░░░░░░░ │ 59 tokens
└─────────────────────┘

Tokens refilled: +100 every 60 seconds
```

**Example:**

```yaml
# API allows bursts of up to 100 requests,
# then throttles to ~1.67 req/sec sustained
rateLimit:
  strategy: token-bucket
  max: 100
  windowMs: 60000
```

**Use cases:**

- General API protection
- Allowing temporary bursts
- User-facing APIs
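To make the refill mechanics concrete, here is a minimal in-memory token bucket sketch in JavaScript. This is an illustration only, not FlexGate's internal implementation; the class and parameter names are hypothetical. It refills continuously based on elapsed time rather than in discrete windows.

```javascript
// Minimal token-bucket sketch (illustrative; not FlexGate's internals).
class TokenBucket {
  constructor(capacity, refillPerMs, now = Date.now()) {
    this.capacity = capacity;       // "max" in the config
    this.refillPerMs = refillPerMs; // tokensPerInterval / windowMs
    this.tokens = capacity;         // bucket starts full
    this.lastRefill = now;
  }

  tryConsume(now = Date.now()) {
    // Top up the bucket for the time elapsed since the last check.
    const elapsed = now - this.lastRefill;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerMs);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;  // request allowed
    }
    return false;   // rate limited
  }
}
```

With the config above (`max: 100`, `tokensPerInterval: 100`, `windowMs: 60000`), `capacity` would be 100 and `refillPerMs` would be `100 / 60000`, giving the ~1.67 req/sec sustained rate.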
### 2. Fixed Window

**How it works:**

- Counter resets at fixed intervals
- Simple and predictable
- Can allow bursts at window boundaries

**Configuration:**

```yaml
rateLimit:
  enabled: true
  strategy: fixed-window
  max: 100
  windowMs: 60000
```

**Timeline:**

```
Time:     00:00     00:30     01:00     01:30     02:00
Window:   [--- Window 1 ---]  [--- Window 2 ---]
Requests:  50 req    50 req  |  100 req   0 req
             ✅        ✅    |     ❌       ✅
```

**Advantages:**

- ✅ Simple to understand
- ✅ Low memory usage
- ✅ Predictable resets

**Disadvantages:**

- ❌ Bursts at window boundaries
- ❌ Can allow 2× the limit (end of one window + start of the next)

**Example:**

```yaml
# Exactly 100 requests per minute
# Counter resets at :00, :01, :02, etc.
rateLimit:
  strategy: fixed-window
  max: 100
  windowMs: 60000 # 1-minute windows
```

**Use cases:**

- Simple quotas
- Internal APIs
- When predictability matters
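The reset-at-a-boundary behavior can be sketched in a few lines (illustrative only; the class name and structure are hypothetical, not FlexGate's code):

```javascript
// Fixed-window counter sketch: the window is derived from the timestamp,
// so the counter resets automatically when time crosses a boundary.
class FixedWindowLimiter {
  constructor(max, windowMs) {
    this.max = max;
    this.windowMs = windowMs;
    this.windowStart = 0;
    this.count = 0;
  }

  tryRequest(now = Date.now()) {
    const windowStart = Math.floor(now / this.windowMs) * this.windowMs;
    if (windowStart !== this.windowStart) {
      this.windowStart = windowStart; // new window: reset the counter
      this.count = 0;
    }
    if (this.count < this.max) {
      this.count += 1;
      return true;
    }
    return false;
  }
}
```

Note the boundary weakness: a client can use its full quota at the very end of one window and again at the very start of the next, briefly doubling the effective rate.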
### 3. Sliding Window

**How it works:**

- Tracks requests in a rolling window
- More accurate than fixed window
- Prevents boundary bursts

**Configuration:**

```yaml
rateLimit:
  enabled: true
  strategy: sliding-window
  max: 100
  windowMs: 60000
  precision: 60 # Number of sub-windows
```

**How it prevents bursts:**

```
Fixed Window Problem:
  00:59  →  01:00  →  01:01
  50 req    100 req  =  150 requests in 2 seconds!

Sliding Window Solution:
  Continuously tracks the last 60 seconds
  No boundary burst possible
```

**Advantages:**

- ✅ Most accurate
- ✅ No boundary bursts
- ✅ Fair distribution

**Disadvantages:**

- ❌ Higher memory usage
- ❌ More complex

**Example:**

```yaml
# Strict 100 requests per 60 seconds,
# calculated continuously
rateLimit:
  strategy: sliding-window
  max: 100
  windowMs: 60000
  precision: 60 # 60 one-second buckets
```

**Use cases:**

- Strict rate limiting
- Preventing abuse
- Public APIs
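For intuition, here is the exact-log variant of a sliding window, which keeps every timestamp in the trailing window (illustrative only; FlexGate's `precision` setting suggests a bucketed approximation that trades a little accuracy for far less memory):

```javascript
// Sliding-window log sketch: counts request timestamps in the trailing
// windowMs. Exact, but O(requests) memory per client.
class SlidingWindowLimiter {
  constructor(max, windowMs) {
    this.max = max;
    this.windowMs = windowMs;
    this.timestamps = [];
  }

  tryRequest(now = Date.now()) {
    const cutoff = now - this.windowMs;
    // Drop entries that have aged out of the window.
    while (this.timestamps.length && this.timestamps[0] <= cutoff) {
      this.timestamps.shift();
    }
    if (this.timestamps.length < this.max) {
      this.timestamps.push(now);
      return true;
    }
    return false;
  }
}
```

Because the window slides continuously, there is no boundary at which the counter resets, so the fixed-window burst problem cannot occur.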
### 4. Leaky Bucket

**How it works:**

- Requests queue in a bucket
- The bucket leaks at a constant rate
- Overflow = rate limited

**Configuration:**

```yaml
rateLimit:
  enabled: true
  strategy: leaky-bucket
  max: 100        # Queue size
  rate: 10        # Leak rate (req/sec)
  windowMs: 1000  # Rate window
```

**Visualization:**

```
 Incoming Requests
      ↓↓↓↓↓
┌─────────────────┐
│ ████████████░░░ │ Queue (100 max)
│ ████████████░░░ │
└────────┬────────┘
         ↓ Leak: 10/sec
 Processed Requests
```

**Advantages:**

- ✅ Perfectly smooth output
- ✅ Predictable rate
- ✅ No bursts

**Disadvantages:**

- ❌ Can cause delays
- ❌ Queue management complexity

**Example:**

```yaml
# Process at exactly 10 req/sec
# Queue up to 100 requests
rateLimit:
  strategy: leaky-bucket
  max: 100
  rate: 10
  windowMs: 1000
```

**Use cases:**

- Backend protection
- Upstream rate limits
- Queue-based systems
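A minimal leaky-bucket sketch, tracking only the queue depth rather than the queued requests themselves (illustrative; names are hypothetical, not FlexGate's internals):

```javascript
// Leaky-bucket sketch: the queue drains at a fixed rate; requests that
// arrive when the queue is full are rejected.
class LeakyBucket {
  constructor(capacity, leakPerMs) {
    this.capacity = capacity;   // "max" (queue size)
    this.leakPerMs = leakPerMs; // rate / windowMs
    this.level = 0;             // current queue depth
    this.lastLeak = 0;
  }

  tryEnqueue(now = Date.now()) {
    // Drain for the elapsed time, then try to add this request.
    this.level = Math.max(0, this.level - (now - this.lastLeak) * this.leakPerMs);
    this.lastLeak = now;
    if (this.level + 1 <= this.capacity) {
      this.level += 1;
      return true;  // queued for processing
    }
    return false;   // bucket overflow: rate limited
  }
}
```

Unlike the token bucket, arrivals above the leak rate are not served immediately as a burst; they wait in the queue, which is why this strategy smooths output at the cost of added latency.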
## Rate Limit Scope

Control what the limit applies to:

### Per IP Address (Default)

Limit by client IP:

```yaml
rateLimit:
  enabled: true
  scope: ip
  max: 100
  windowMs: 60000
```

**Use case:** Prevent a single IP from overwhelming the API

**Redis key:** `rate-limit:route-id:192.168.1.100`

### Per User

Limit by authenticated user:

```yaml
rateLimit:
  enabled: true
  scope: user
  max: 1000 # Higher limit for authenticated users
  windowMs: 60000
```

**Requirements:**

- Authentication enabled
- User ID available in the request

**Redis key:** `rate-limit:route-id:user:12345`

**Use case:** Different limits per user tier

### Per API Key

Limit by API key:

```yaml
rateLimit:
  enabled: true
  scope: apiKey
  max: 10000 # Enterprise API key limit
  windowMs: 60000
```

**Redis key:** `rate-limit:route-id:key:abc123`

**Use case:** API key quotas

### Global (Per Route)

Limit total traffic to a route:

```yaml
rateLimit:
  enabled: true
  scope: route
  max: 10000 # Total limit for all clients
  windowMs: 60000
```

**Redis key:** `rate-limit:route-id:global`

**Use case:** Protect backend capacity

### Custom Scope

Combine multiple factors:

```yaml
rateLimit:
  enabled: true
  scope: custom
  keyGenerator: ip+user+apiKey
  max: 100
  windowMs: 60000
```

**Redis key:** `rate-limit:route-id:192.168.1.100:user123:key456`

**Use case:** Complex rate limiting scenarios
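A composite key like the one above might be assembled as follows (the key layout mirrors the Redis keys shown in this section, but the function itself is illustrative, not FlexGate's actual generator):

```javascript
// Build a composite rate-limit key from the configured factors.
// Layout is illustrative: rate-limit:{route-id}:{factor}:{factor}:...
function rateLimitKey(routeId, parts) {
  // Drop missing factors (e.g. no API key on an anonymous request)
  return ['rate-limit', routeId, ...parts.filter(Boolean)].join(':');
}
```

Keeping the factors in a fixed order is important: `ip+user` and `user+ip` would otherwise produce distinct counters for the same client.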
## Advanced Configuration

### Different Limits by Method

```yaml
rateLimit:
  enabled: true
  # Different limits per HTTP method
  limits:
    GET:
      max: 1000
      windowMs: 60000
    POST:
      max: 100
      windowMs: 60000
    DELETE:
      max: 10
      windowMs: 60000
```

### Tiered Rate Limiting

Different limits based on user tier:

```yaml
rateLimit:
  enabled: true
  scope: user
  tiers:
    free:
      max: 100
      windowMs: 60000
    pro:
      max: 1000
      windowMs: 60000
    enterprise:
      max: 100000
      windowMs: 60000
```

**Implementation:**

```javascript
// Determine the tier from JWT claims or a database lookup
const tier = req.user.subscriptionTier; // 'free', 'pro', or 'enterprise'
```

### Dynamic Rate Limits
Adjust limits based on system load:

```yaml
rateLimit:
  enabled: true
  dynamic: true
  # Base limits
  max: 1000
  windowMs: 60000
  # Reduce when the system is under load
  loadThresholds:
    cpu: 80       # % CPU usage
    memory: 85    # % memory usage
    reduction: 50 # % reduction in limit
```

**Example:**

- Normal: 1000 req/min
- High load (CPU > 80%): 500 req/min
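The reduction rule amounts to a simple calculation. The sketch below shows it as a pure function (the function and field names are hypothetical, chosen to match the `loadThresholds` config above):

```javascript
// Compute the effective limit given current load and the configured
// loadThresholds (cpu %, memory %, reduction %). Sketch only.
function effectiveLimit(baseMax, cpuPct, memPct, thresholds) {
  const overloaded = cpuPct > thresholds.cpu || memPct > thresholds.memory;
  return overloaded
    ? Math.floor(baseMax * (1 - thresholds.reduction / 100))
    : baseMax;
}
```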
### Skip Conditions

Don't count certain requests:

```yaml
rateLimit:
  enabled: true
  max: 100
  windowMs: 60000
  skip:
    # Set to true to exclude successful requests from the count
    successfulRequests: false
    # Set to true to exclude failed requests from the count
    failedRequests: false
    # Skip based on headers
    headers:
      X-Admin-Token: secret123
    # Skip based on IP
    ips:
      - 127.0.0.1
      - 10.0.0.0/8
    # Skip based on user role
    roles:
      - admin
      - service-account
```

### Custom Error Response
Customize the rate limit response:

```yaml
rateLimit:
  enabled: true
  max: 100
  windowMs: 60000
  response:
    statusCode: 429
    headers:
      X-RateLimit-Limit: "${max}"
      X-RateLimit-Remaining: "${remaining}"
      X-RateLimit-Reset: "${reset}"
      Retry-After: "${retryAfter}"
    body:
      error: "Rate Limit Exceeded"
      message: "You have exceeded the rate limit of ${max} requests per ${window}."
      retryAfter: "${retryAfter}"
      documentation: "https://docs.example.com/rate-limits"
```

**Response example:**

```http
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1707485723
Retry-After: 42

{
  "error": "Rate Limit Exceeded",
  "message": "You have exceeded the rate limit of 100 requests per 60000ms.",
  "retryAfter": 42,
  "documentation": "https://docs.example.com/rate-limits"
}
```

## Response Headers
FlexGate adds rate limit headers to responses:

### Standard Headers

```http
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1707485723
```

**Explanation:**

- **Limit**: Total requests allowed in the window
- **Remaining**: Requests left in the current window
- **Reset**: Unix timestamp when the limit resets

### Additional Headers

```http
X-RateLimit-Window: 60000
X-RateLimit-Scope: ip
X-RateLimit-Strategy: token-bucket
```

### Retry-After Header

When rate limited:

```http
HTTP/1.1 429 Too Many Requests
Retry-After: 42
```

**Value:** Seconds until the limit resets
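Well-behaved clients should wait out this header before retrying. A small helper for computing the wait is sketched below (illustrative; note that per RFC 9110 the header value may be either a number of seconds or an HTTP-date):

```javascript
// Convert a Retry-After header value into a delay in milliseconds.
// Returns null when the header is absent or unparseable.
function retryDelayMs(retryAfterHeader, nowMs = Date.now()) {
  if (retryAfterHeader == null) return null;
  const seconds = Number(retryAfterHeader);
  if (Number.isFinite(seconds)) return Math.max(0, seconds * 1000);
  // Retry-After may also be an HTTP-date
  const at = Date.parse(retryAfterHeader);
  return Number.isNaN(at) ? null : Math.max(0, at - nowMs);
}
```

A client retry loop would sleep for `retryDelayMs(res.headers['retry-after'])` milliseconds after a 429 before attempting the request again.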
## Storage Backend

Rate limit counters are stored in Redis:

### Redis Configuration

```yaml
redis:
  host: localhost
  port: 6379
  db: 0
  password: your-redis-password
  # Rate limit specific settings
  rateLimit:
    prefix: "rate-limit:"
    ttl: 86400 # 24 hours
```

### Redis Key Structure

Keys follow this pattern:

```
rate-limit:{route-id}:{scope}:{identifier}
```

**Examples:**

```
rate-limit:api-users:ip:192.168.1.100
rate-limit:api-posts:user:12345
rate-limit:api-search:apiKey:abc123
rate-limit:api-public:global
```

### Inspecting Redis

View rate limit data:

```bash
# List all rate limit keys
redis-cli KEYS "rate-limit:*"

# Get a specific counter
redis-cli GET "rate-limit:api-users:ip:192.168.1.100"

# View the TTL
redis-cli TTL "rate-limit:api-users:ip:192.168.1.100"

# Delete (reset) a counter
redis-cli DEL "rate-limit:api-users:ip:192.168.1.100"
```

### Redis Failover
Handle Redis failures gracefully:

```yaml
rateLimit:
  enabled: true
  max: 100
  windowMs: 60000
  # Fallback when Redis is unavailable
  fallback:
    strategy: allow # or 'deny', 'memory'
    # Use in-memory rate limiting
    memory:
      enabled: true
      syncInterval: 5000 # Sync to Redis when available
```

**Strategies:**

- `allow`: Allow requests (fail open)
- `deny`: Deny requests (fail closed)
- `memory`: Use an in-memory store (not distributed)
## Monitoring Rate Limits

### Metrics

FlexGate exports rate limit metrics:

```
# Total rate limit hits
flexgate_rate_limit_hits_total{route="api-users",scope="ip"} 234

# Remaining quota
flexgate_rate_limit_remaining{route="api-users",scope="ip"} 45

# Blocked requests
flexgate_rate_limit_blocked_total{route="api-users",scope="ip"} 156
```

### Admin UI
View rate limiting in the Admin UI:

**Dashboard:**

1. Navigate to http://localhost:3000/admin/monitoring
2. The Rate Limiting section shows:
   - Total blocked requests
   - Top blocked IPs
   - Rate limit hit rate
   - Per-route statistics

**Route Details:**

1. Go to Routes → select a route
2. The Rate Limiting tab shows:
   - Current limits
   - Hit statistics
   - Blocked clients
   - Reset times
### Logs

Rate limit events are logged:

```json
{
  "timestamp": "2026-02-09T10:30:15.123Z",
  "level": "warn",
  "message": "Rate limit exceeded",
  "route": "api-users",
  "clientIp": "192.168.1.100",
  "limit": 100,
  "window": 60000,
  "remaining": 0,
  "resetAt": 1707485723
}
```

## Testing Rate Limits
### Manual Testing

Test with curl (the `%header{}` write-out variable requires curl 7.83+):

```bash
# Send 105 requests quickly
for i in {1..105}; do
  echo "Request $i:"
  curl -w "Status: %{http_code}, Remaining: %header{X-RateLimit-Remaining}\n" \
    http://localhost:3000/api/users \
    -s -o /dev/null
done
```

**Expected output:**

```
Request 1: Status: 200, Remaining: 99
Request 2: Status: 200, Remaining: 98
...
Request 100: Status: 200, Remaining: 0
Request 101: Status: 429, Remaining: 0
Request 102: Status: 429, Remaining: 0
```

### Automated Testing
Test with Apache Bench:

```bash
# 1000 requests, 50 concurrent
ab -n 1000 -c 50 \
  -H "Authorization: Bearer token123" \
  http://localhost:3000/api/users
```

Check the "Non-2xx responses" count in the output for rate-limited (429) requests.

### Load Testing
Use k6 for comprehensive testing:

```javascript
// rate-limit-test.js
import http from 'k6/http';
import { check } from 'k6';

export let options = {
  stages: [
    { duration: '1m', target: 100 }, // Ramp up to 100 users
    { duration: '3m', target: 100 }, // Stay at 100
    { duration: '1m', target: 0 },   // Ramp down
  ],
};

export default function () {
  let res = http.get('http://localhost:3000/api/users');
  check(res, {
    'not rate limited': (r) => r.status !== 429,
    // k6 canonicalizes header names, hence 'X-Ratelimit-Limit'
    'has rate limit headers': (r) => r.headers['X-Ratelimit-Limit'] !== undefined,
  });
}
```

Run the test:

```bash
k6 run rate-limit-test.js
```

## Common Patterns
### Public API Pattern

Different limits for public vs. authenticated routes:

```yaml
routes:
  - id: api-public
    path: /api/public/*
    upstream: http://backend:8080
    rateLimit:
      enabled: true
      scope: ip
      max: 10 # Low limit for anonymous clients
      windowMs: 60000

  - id: api-authenticated
    path: /api/auth/*
    upstream: http://backend:8080
    auth:
      enabled: true
      type: jwt
    rateLimit:
      enabled: true
      scope: user
      max: 1000 # Higher limit for authenticated users
      windowMs: 60000
```

### Freemium Pattern
Tiered limits based on subscription:

```yaml
rateLimit:
  enabled: true
  scope: apiKey
  tiers:
    free:
      max: 100
      windowMs: 86400000 # Per day
    starter:
      max: 10000
      windowMs: 86400000
    professional:
      max: 100000
      windowMs: 86400000
    enterprise:
      max: 1000000
      windowMs: 86400000
```

### Resource-Based Pattern
Different limits per endpoint:

```yaml
routes:
  # Read-heavy endpoints
  - id: api-users-list
    path: /api/users
    methods: [GET]
    rateLimit:
      max: 1000
      windowMs: 60000

  # Write endpoints (more expensive)
  - id: api-users-create
    path: /api/users
    methods: [POST]
    rateLimit:
      max: 100
      windowMs: 60000

  # Search (very expensive)
  - id: api-search
    path: /api/search
    rateLimit:
      max: 10
      windowMs: 60000
```

### Burst Protection Pattern
Allow bursts but limit the sustained rate:

```yaml
rateLimit:
  strategy: token-bucket
  # Allow bursts of up to 100
  max: 100
  # But sustain only ~1.67 req/sec (100 tokens per minute)
  tokensPerInterval: 100
  windowMs: 60000
```

## Best Practices
### 1. Choose an Appropriate Strategy

| Use Case | Strategy |
|---|---|
| General protection | Token Bucket |
| Simple quotas | Fixed Window |
| Strict limits | Sliding Window |
| Backend protection | Leaky Bucket |
### 2. Set Reasonable Limits

**Too low:**

- ❌ Frustrates legitimate users
- ❌ False positives

**Too high:**

- ❌ Doesn't protect the backend
- ❌ Allows abuse

**Just right:**

- ✅ Based on backend capacity
- ✅ Monitored and adjusted over time
- ✅ Different limits per tier
### 3. Use an Appropriate Scope

| Scenario | Scope |
|---|---|
| Public API | IP + User |
| Internal API | User |
| Partner API | API Key |
| Backend protection | Route (global) |
### 4. Provide Clear Feedback

Always include:

- ✅ Rate limit headers
- ✅ A Retry-After header
- ✅ A helpful error message
- ✅ A documentation link

### 5. Monitor and Alert

Track:

- Rate limit hit rate
- Most-blocked IPs
- Unusual patterns
- False positives

### 6. Plan for Scale

Consider:

- A Redis cluster for high traffic
- Distributed counters
- Eventual consistency
- A failover strategy
## Troubleshooting

### Rate Limits Not Working

**Check:**

1. Rate limiting is enabled: `rateLimit.enabled: true`
2. Redis is connected: `redis-cli ping`
3. The correct scope is configured
4. The TTL is not expiring immediately

**Debug:**

```bash
# Check Redis keys
redis-cli KEYS "rate-limit:*"

# Monitor Redis commands
redis-cli MONITOR

# Check FlexGate logs
flexgate logs --grep "rate limit"
```

### Too Many False Positives
**Solutions:**

- Increase the limits
- Change the scope (IP → User)
- Add skip conditions
- Use a whitelist

### Redis Memory Issues

**Check memory:**

```bash
redis-cli INFO memory
```

**Solutions:**

- Set a TTL on keys
- Use LRU eviction
- Increase Redis memory
- Scale Redis (cluster)

### Inconsistent Limits

**Cause:** Multiple FlexGate instances using the memory backend

**Solution:**

- ✅ Use Redis (distributed)
- ❌ Don't use the memory backend in production
## Next Steps

- **Circuit Breaker**: Complement rate limiting with a circuit breaker
- **Authentication**: Combine with auth
- **Monitoring**: Monitor rate limits

Questions? Join our GitHub Discussions.