Every server on the internet gets hammered. Credential stuffing bots testing username/password combinations. Scrapers pulling your entire site at 200 requests per second. DDoS floods trying to saturate your PHP workers. Spam bots pounding your login page or contact form. NGINX has a built-in rate limiting system that handles all of this — and it’s surprisingly powerful once you understand how it works.
This guide covers NGINX rate limiting from scratch: the core concepts, the directives, how to limit different endpoints differently, how to handle burst traffic without breaking legitimate users, and how to combine rate limiting with the dynamic modules from the myguard repository for even more control.
How NGINX Rate Limiting Works
NGINX rate limiting uses the leaky bucket algorithm. Think of it as a bucket with a small hole in the bottom. Requests flow in from the top; they drain out the bottom at a fixed rate. If requests arrive faster than the drain rate, the bucket fills up. Once full, excess requests are either delayed (held in a queue) or rejected — with a 503 status code by default, though this is configurable.
Two directives do the work:
- `limit_req_zone` — defined in the `http` block; declares a shared memory zone and sets the rate
- `limit_req` — applied in a `server` or `location` block; activates rate limiting using a named zone
Basic Rate Limiting Configuration
```nginx
http {
    # Define a zone: track by IP, 10MB shared memory, 10 req/sec per IP
    limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;

    server {
        listen 443 ssl;
        server_name example.com;

        location / {
            limit_req zone=general;
            # ... rest of config
        }
    }
}
```
The $binary_remote_addr variable uses the binary representation of the IP address (4 bytes for IPv4, 16 for IPv6) as the key — more memory-efficient than the string form. A 10MB zone holds about 160,000 IP state entries.
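The back-of-envelope math behind that figure (the roughly 64-byte per-state cost comes from the NGINX documentation and is approximate):

```python
# Approximate limit_req zone capacity: NGINX stores roughly 64 bytes
# of state per small key such as $binary_remote_addr (per the docs).
zone_bytes = 10 * 1024 * 1024      # a 10m zone
state_bytes = 64                   # approximate per-entry cost
print(zone_bytes // state_bytes)   # 163840 entries, i.e. ~160k IPs
```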
Understanding Burst
A rate of 10r/s means NGINX allows one request per 100ms. If a user sends two requests in a 50ms window, the second one gets a 503 immediately. That’s too strict for real browsers — a page load triggers many simultaneous requests (HTML, CSS, JS, images).
The burst parameter adds a queue for excess requests:
```nginx
location / {
    limit_req zone=general burst=20;
    # The burst queue holds up to 20 excess requests
    # They're delayed (not rejected) until the rate allows them through
}
```
With burst=20 and rate=10r/s: one request is processed immediately, and up to 20 excess requests queue and drain in order at the zone's rate. Once the queue is full, any further request is rejected with a 503. This absorbs legitimate page-load bursts without raising the overall rate limit.
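The accounting is simple enough to sketch. This is a minimal Python model of the bookkeeping limit_req performs (the real implementation works in millisecond units inside shared memory, but the accept/reject logic is the same idea):

```python
import time

class LeakyBucket:
    """Minimal model of limit_req's accounting: each request adds one
    unit of 'excess', which drains at `rate` units/second; a request
    that would push excess past `burst` is rejected."""

    def __init__(self, rate, burst):
        self.rate = float(rate)    # drain rate, requests/second
        self.burst = burst         # how much excess may queue up
        self.excess = 0.0
        self.last = 0.0

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Drain the bucket for the elapsed time, then add this request
        e = max(self.excess - (now - self.last) * self.rate, 0.0) + 1.0
        if e - 1.0 > self.burst:
            return False           # queue full: reject (503 by default)
        self.excess, self.last = e, now
        return True

bucket = LeakyBucket(rate=10, burst=0)   # 10r/s, no burst
print(bucket.allow(0.00))   # True  - first request passes
print(bucket.allow(0.05))   # False - only 1 request per 100ms allowed
print(bucket.allow(0.10))   # True  - 100ms later, capacity is back
```

With burst=2, the same bucket accepts three back-to-back requests (one in-flight plus two queued) and rejects the fourth, mirroring NGINX's behavior.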
Add nodelay to process burst requests immediately instead of delaying them:
```nginx
limit_req zone=general burst=20 nodelay;
# Burst requests are processed immediately, not queued
# The burst allowance still refills at the zone's rate
```
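Since NGINX 1.15.7 there is also a middle ground between delaying every excess request and nodelay: the delay= parameter serves the first part of a burst instantly and paces the rest. A sketch:

```nginx
location / {
    # Two-stage limiting (NGINX 1.15.7+): the first 8 excess requests
    # are served immediately, requests 9-20 in the burst are delayed
    # to match the zone's rate, and anything beyond burst=20 is rejected
    limit_req zone=general burst=20 delay=8;
}
```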
Protecting Specific Endpoints
Different endpoints need different limits. A login page needs much tighter limits than a static page. Define multiple zones:
```nginx
http {
    # General traffic: 20 req/sec per IP
    limit_req_zone $binary_remote_addr zone=general:10m rate=20r/s;

    # Login endpoint: 5 req/min per IP (credential stuffing protection)
    limit_req_zone $binary_remote_addr zone=login:10m rate=5r/m;

    # API: 100 req/sec per IP
    limit_req_zone $binary_remote_addr zone=api:10m rate=100r/s;

    # Search: 10 req/min per IP (expensive queries).
    # NGINX locations never match query strings, so "location /?s=" would
    # never fire; key the zone on $arg_s instead. The key is empty (no
    # limiting) unless a ?s= parameter is present.
    map $arg_s $search_key {
        ""      "";
        default $binary_remote_addr;
    }
    limit_req_zone $search_key zone=search:10m rate=10r/m;

    server {
        location / {
            limit_req zone=general burst=30 nodelay;
            limit_req zone=search burst=2;
        }

        location /wp-login.php {
            limit_req zone=login burst=3;
            # At 5r/min with burst=3: a few quick login attempts succeed,
            # but hammering with 100 attempts/second is rejected
            # immediately (503 unless limit_req_status says otherwise)
        }

        location /api/ {
            limit_req zone=api burst=50 nodelay;
        }
    }
}
```
Rate Limiting by Multiple Keys
Rate limiting by IP alone can be too coarse — legitimate users behind corporate NAT or proxies share an IP. Rate limiting by IP + user agent or IP + URL gives more granular control:
```nginx
http {
    # Rate limit by IP + URI (separate budget per URL per IP)
    limit_req_zone "$binary_remote_addr$uri" zone=per_uri:20m rate=5r/s;

    # Rate limit authenticated users by user ID header.
    # Note: limit_req sees the *incoming* request, so X-User-ID must be
    # set by a trusted proxy in front of NGINX (and stripped from client
    # traffic), otherwise clients can spoof their own key
    limit_req_zone $http_x_user_id zone=per_user:10m rate=50r/s;
}
```
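One wrinkle with a header-based key: requests without the header all share a single empty key, and empty keys are not limited at all. A map can fall back to the client IP for anonymous traffic (the variable and zone names here are illustrative):

```nginx
map $http_x_user_id $user_key {
    ""      $binary_remote_addr;  # anonymous: limit by IP
    default $http_x_user_id;      # authenticated: limit by user ID
}
limit_req_zone $user_key zone=per_user_or_ip:10m rate=50r/s;
```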
Returning Custom 429 Responses
By default, rate-limited requests get a plain 503. Change this to a proper 429 (Too Many Requests) with a Retry-After header:
```nginx
http {
    limit_req_status 429;

    server {
        # Custom JSON error for API clients
        error_page 429 /rate-limited.json;

        location = /rate-limited.json {
            internal;
            default_type application/json;
            add_header Retry-After 60 always;
            return 429 '{"error":"rate_limit_exceeded","retry_after":60}';
        }
    }
}
```
Whitelisting Trusted IPs
Internal monitoring tools, load balancer health checks, and your own IP should bypass rate limiting:
```nginx
http {
    # Geo-based bypass: set $limit to an empty string for trusted IPs
    # (variables in geo values require NGINX 1.11.0+)
    geo $limit {
        default        $binary_remote_addr;  # Apply limit
        10.0.0.0/8     "";                   # Internal: no limit
        192.168.0.0/16 "";                   # Private: no limit
        203.0.113.42   "";                   # Your office IP: no limit
    }

    limit_req_zone $limit zone=general:10m rate=10r/s;
    # Requests with an empty key are not accounted at all,
    # so this is a true zero-overhead bypass
}
```
Monitoring Rate Limit Events
NGINX logs rate limit rejections to the error log at error level by default; delayed requests are logged one level lower, at warn. To quiet a noisy zone (or make delays visible in your monitoring), adjust limit_req_log_level:

```nginx
location /wp-login.php {
    limit_req zone=login burst=3;
    limit_req_log_level warn;  # Rejections at warn, delays at notice
    # (the default is error for rejections, warn for delays)
}
```
```shell
# Count rate limit events so far
grep -c 'limiting requests' /var/log/nginx/error.log

# Watch them in real time
tail -f /var/log/nginx/error.log | grep 'limiting'
```
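For a quick view of which zones and clients are tripping limits, the rejection lines can be aggregated. A sketch in Python (the log line shape below matches stock NGINX error logs, but verify it against your own format):

```python
import re
from collections import Counter

# Rejections look like:
#   2024/01/01 12:00:00 [error] 1#1: *7 limiting requests,
#   excess: 3.020 by zone "login", client: 203.0.113.7, server: ...
LINE_RE = re.compile(
    r'limiting requests.*?by zone "(?P<zone>[^"]+)", '
    r'client: (?P<ip>[0-9a-fA-F.:]+)'
)

def top_limited(lines, n=10):
    """Count rejections per (zone, client IP) pair."""
    hits = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            hits[(m.group("zone"), m.group("ip"))] += 1
    return hits.most_common(n)

sample = [
    '2024/01/01 12:00:00 [error] 1#1: *7 limiting requests, '
    'excess: 3.020 by zone "login", client: 203.0.113.7, '
    'server: example.com, request: "POST /wp-login.php HTTP/1.1"',
] * 3
print(top_limited(sample))   # [(('login', '203.0.113.7'), 3)]
```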
WordPress-Specific Rate Limiting
For a WordPress site, these are the endpoints worth rate limiting most aggressively:
```nginx
http {
    limit_req_zone $binary_remote_addr zone=wp_login:10m rate=5r/m;
    limit_req_zone $binary_remote_addr zone=wp_comments:5m rate=1r/m;
    limit_req_zone $binary_remote_addr zone=xmlrpc:5m rate=1r/m;
    limit_req_zone $binary_remote_addr zone=general:20m rate=30r/s;

    server {
        # Credential stuffing protection
        location = /wp-login.php {
            limit_req zone=wp_login burst=3;
        }

        # XML-RPC is a DDoS amplification target — block entirely if unused
        location = /xmlrpc.php {
            limit_req zone=xmlrpc burst=1;
            # Or just block it: return 403;
        }

        # Comment spam protection
        location = /wp-comments-post.php {
            limit_req zone=wp_comments burst=1;
        }

        # wp-admin: protect but allow legitimate admin use
        location /wp-admin/ {
            limit_req zone=general burst=20 nodelay;
        }
    }
}
```
Redis-Backed Cross-Server Rate Limiting
NGINX’s built-in rate limiting uses shared memory within a single server. If you have multiple NGINX instances behind a load balancer, each one has independent rate limit state — a bot that hits different servers can exceed your intended rate by a factor of N.
For true cross-server rate limiting, use the NGINX Lua module with Redis:
```nginx
location /wp-login.php {
    access_by_lua_block {
        local redis = require "resty.redis"
        local red = redis:new()
        red:set_timeouts(50, 50, 50)  -- connect/send/read timeouts, ms

        local ok, err = red:connect("127.0.0.1", 6379)
        if not ok then
            -- Redis unreachable: fail open rather than block logins
            ngx.log(ngx.ERR, "rate limit redis connect failed: ", err)
            return
        end

        local key = "rl:login:" .. ngx.var.binary_remote_addr
        local count, err = red:incr(key)
        if not count then
            ngx.log(ngx.ERR, "rate limit incr failed: ", err)
            return
        end
        if count == 1 then
            red:expire(key, 60)  -- fixed 60-second window per IP
        end
        red:set_keepalive(10000, 20)

        if count > 5 then
            ngx.header["Retry-After"] = "60"
            return ngx.exit(429)
        end
    }
}
```
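What the Lua block implements is a fixed-window counter: INCR once per request, EXPIRE to reset the window. The same accounting, sketched in Python against an in-memory stand-in for Redis:

```python
class FixedWindowLimiter:
    """Fixed-window rate limiter: at most `limit` requests per key
    per `window_s` seconds. In-memory stand-in for Redis INCR + EXPIRE."""

    def __init__(self, limit, window_s):
        self.limit = limit
        self.window = window_s
        self.store = {}            # key -> (count, window_expiry)

    def allow(self, key, now):
        count, expiry = self.store.get(key, (0, now + self.window))
        if now >= expiry:          # window expired: start a fresh one
            count, expiry = 0, now + self.window
        count += 1                 # the Redis INCR step
        self.store[key] = (count, expiry)
        return count <= self.limit

rl = FixedWindowLimiter(limit=5, window_s=60)
print(all(rl.allow("203.0.113.7", t) for t in range(5)))  # True
print(rl.allow("203.0.113.7", 5))    # False - 6th request in the window
print(rl.allow("203.0.113.7", 61))   # True  - new window has started
```

Note that fixed windows can admit up to twice the limit across a window boundary (a burst at the end of one window plus a burst at the start of the next); if that matters for your endpoint, a sliding-window or leaky-bucket scheme in Redis avoids it.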
Related Posts
- NGINX ModSecurity WAF Setup — pair rate limiting with WAF for comprehensive attack protection
- NGINX Lua Module Guide — Redis-backed rate limiting for multi-server setups
- NGINX Dynamic Modules Overview — all 50+ modules including the GeoIP2 module for geo-based limits
- NGINX Performance and Security Expert Guide — full security and performance hardening guide
- Angie Web Server Complete Guide — rate limiting works identically on Angie