NGINX Rate Limiting: Protect Your Server from Bots, Scrapers and Brute Force

Every server on the internet gets hammered. Credential stuffing bots testing username/password combinations. Scrapers pulling your entire site at 200 requests per second. DDoS floods trying to saturate your PHP workers. Spam bots pounding your login page or contact form. NGINX has a built-in rate limiting system that handles all of this — and it’s surprisingly powerful once you understand how it works.

This guide covers NGINX rate limiting from scratch: the core concepts, the directives, how to limit different endpoints differently, how to handle burst traffic without breaking legitimate users, and how to combine rate limiting with the dynamic modules from the myguard repository for even more control.

How NGINX Rate Limiting Works

NGINX rate limiting uses the leaky bucket algorithm. Think of it as a bucket with a small hole in the bottom. Requests flow in from the top; they drain out the bottom at a fixed rate. If requests arrive faster than the drain rate, the bucket fills up. Once full, excess requests are either delayed (held in a queue) or rejected, by default with a 503 status code (configurable to 429 via limit_req_status).
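Before burst enters the picture, the base rule can be sketched as a toy model (not NGINX's actual implementation): at rate r, one request is admitted every 1/r seconds, and anything arriving sooner is excess.

```python
# Toy sketch of the base rule: at `rate` req/sec, one request is
# admitted every 1/rate seconds; earlier arrivals are excess.
def excess(arrival_times, rate):
    allowed, last = [], None
    for t in arrival_times:
        if last is None or t - last >= 1.0 / rate:
            allowed.append(t)
            last = t
    return [t for t in arrival_times if t not in allowed]

# At 10 r/s (one per 100 ms), two requests 50 ms apart:
print(excess([0.0, 0.05], rate=10))   # → [0.05] is excess
```

What NGINX does with that excess, queue it or reject it, is exactly what the burst parameter controls.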

Two directives do the work:

  • limit_req_zone — defined in the http block; declares a shared memory zone and sets the rate
  • limit_req — applied in a server or location block; activates rate limiting using a named zone

Basic Rate Limiting Configuration

http {
    # Define a zone: track by IP, 10MB shared memory, 10 req/sec per IP
    limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;

    server {
        listen 443 ssl;
        server_name example.com;

        location / {
            limit_req zone=general;
            # ... rest of config
        }
    }
}

The $binary_remote_addr variable uses the binary representation of the IP address (4 bytes for IPv4, 16 for IPv6) as the key — more memory-efficient than the string form. A 10MB zone holds about 160,000 IP state entries.
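That sizing estimate is simple arithmetic (64 bytes is the per-state size the nginx docs use for sizing guidance on most platforms):

```python
# How many IP states fit in a 10m limit_req zone?
zone = 10 * 1024 * 1024   # zone size in bytes (zone=general:10m)
state = 64                # approx. bytes per tracked key
print(zone // state)      # → 163840, roughly 160,000 entries
```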

Understanding Burst

A rate of 10r/s means NGINX allows one request per 100ms. If a user sends two requests in a 50ms window, the second one gets a 503 immediately. That’s too strict for real browsers — a page load triggers many simultaneous requests (HTML, CSS, JS, images).

The burst parameter adds a queue for excess requests:

location / {
    limit_req zone=general burst=20;
    # The burst queue holds up to 20 excess requests
    # They're delayed (not rejected) until the rate allows them through
}

With burst=20 and rate=10r/s: up to 20 requests can queue up and be processed in order. A 21st request gets a 503. This handles legitimate page load bursts without breaking the overall rate limit.

Add nodelay to process burst requests immediately instead of delaying them:

limit_req zone=general burst=20 nodelay;
# Burst requests are processed immediately, not queued
# The burst allowance still refills at the zone's rate
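To make the three outcomes concrete, here is a toy simulation (a simplified leaky bucket, not NGINX's millisecond-granularity implementation) of four simultaneous requests against rate=10r/s burst=2, with and without nodelay:

```python
from dataclasses import dataclass

@dataclass
class LimitReq:
    """Toy limit_req model: the bucket fills by 1 per request and
    drains at `rate`/sec; excess beyond burst is rejected."""
    rate: float
    burst: int
    nodelay: bool = False
    level: float = 0.0    # current bucket fill
    last: float = 0.0     # time of the previous request

    def request(self, now: float) -> str:
        # Drain the bucket for the time elapsed since the last request
        self.level = max(0.0, self.level - (now - self.last) * self.rate)
        self.last = now
        if self.level >= self.burst + 1:
            return "rejected"              # queue full: 503 (or 429)
        self.level += 1
        if self.level <= 1:
            return "served"                # within the base rate
        return "served" if self.nodelay else "delayed"

queued = LimitReq(rate=10, burst=2)
instant = LimitReq(rate=10, burst=2, nodelay=True)
# Four requests arriving at the same instant:
print([queued.request(0.0) for _ in range(4)])
# → ['served', 'delayed', 'delayed', 'rejected']
print([instant.request(0.0) for _ in range(4)])
# → ['served', 'served', 'served', 'rejected']
```

Either way the fourth request is rejected; nodelay only changes whether the two burst requests wait for the bucket to drain.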

Protecting Specific Endpoints

Different endpoints need different limits. A login page needs much tighter limits than a static page. Define multiple zones:

http {
    # General traffic: 20 req/sec per IP
    limit_req_zone $binary_remote_addr zone=general:10m rate=20r/s;

    # Login endpoint: 5 req/min per IP (credential stuffing protection)
    limit_req_zone $binary_remote_addr zone=login:10m rate=5r/m;

    # API: 100 req/sec per IP
    limit_req_zone $binary_remote_addr zone=api:10m rate=100r/s;

    # Search: 10 req/min per IP (expensive queries).
    # Locations never match query strings, so key the zone on a
    # variable that is empty (= no limiting) unless ?s= is present:
    map $arg_s $search_key {
        ""      "";
        default $binary_remote_addr;
    }
    limit_req_zone $search_key zone=search:10m rate=10r/m;

    server {
        location / {
            limit_req zone=general burst=30 nodelay;
            # Only requests carrying ?s= count against the search
            # zone, because its key is empty for everything else
            limit_req zone=search burst=2;
        }

        location /wp-login.php {
            limit_req zone=login burst=3;
            # At 5r/min with burst=3: allows a brief login attempt,
            # but hammering with 100 attempts/second is rejected
            # immediately (503 by default, 429 with limit_req_status)
        }

        location /api/ {
            limit_req zone=api burst=50 nodelay;
        }
    }
}
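One subtlety in those rates: NGINX enforces a rate as a fixed inter-request interval, not a per-window quota. rate=5r/m does not mean "five quick requests, then wait a minute"; it means one request every 12 seconds, with burst absorbing the quick ones:

```python
# Convert nginx rate specs to the interval nginx actually enforces
for rate_str, unit_seconds in [("20r/s", 1), ("5r/m", 60),
                               ("100r/s", 1), ("10r/m", 60)]:
    n = int(rate_str.split("r")[0])           # requests per unit
    interval_ms = unit_seconds * 1000 / n     # enforced spacing
    print(f"{rate_str}: one request every {interval_ms:g} ms")
```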

Rate Limiting by Multiple Keys

Rate limiting by IP alone can be too coarse — legitimate users behind corporate NAT or proxies share an IP. Rate limiting by IP + user agent or IP + URL gives more granular control:

http {
    # Rate limit by IP + URI (different budget per URL per IP)
    limit_req_zone "$binary_remote_addr$uri" zone=per_uri:20m rate=5r/s;

    # Rate limit authenticated users by a user ID header set by an
    # auth layer in front of this server. Strip the header from
    # client requests first, or the key can be spoofed; when it is
    # absent the key is empty and no limit applies.
    limit_req_zone $http_x_user_id zone=per_user:10m rate=50r/s;
}

Returning Custom 429 Responses

By default, rate-limited requests get a plain 503. Change this to a proper 429 (Too Many Requests) with a Retry-After header:

http {
    limit_req_status 429;

    server {
        # Custom JSON error for API clients
        error_page 429 /rate-limited.json;
        location = /rate-limited.json {
            internal;
            default_type application/json;
            return 429 '{"error":"rate_limit_exceeded","retry_after":60}';
            add_header Retry-After 60 always;
        }
    }
}

Whitelisting Trusted IPs

Internal monitoring tools, load balancer health checks, and your own IP should bypass rate limiting:

http {
    # geo values must be static strings, not variables, so pair geo
    # with map: geo flags trusted networks, map builds the zone key
    geo $limit {
        default         1;   # Apply limit
        10.0.0.0/8      0;   # Internal: no limit
        192.168.0.0/16  0;   # Private: no limit
        203.0.113.42    0;   # Your office IP: no limit
    }

    map $limit $limit_key {
        0 "";                    # Trusted: empty key
        1 $binary_remote_addr;   # Everyone else: limit by IP
    }

    limit_req_zone $limit_key zone=general:10m rate=10r/s;

    # When $limit_key is empty, no rate-limit state is tracked
    # This is the correct zero-overhead bypass approach
}

Monitoring Rate Limit Events

NGINX logs rate limit rejections to the error log at error level by default; delayed requests are logged one level lower (warn). To quiet rejections for expected traffic patterns, or tune what reaches your monitoring, use limit_req_log_level:

location /wp-login.php {
    limit_req zone=login burst=3;
    limit_req_log_level warn;   # Rejections at warn (default: error);
                                # delays drop one level further, to info
}

# Count rate limit events
grep -c 'limiting requests' /var/log/nginx/error.log
# Watch them in real time
tail -f /var/log/nginx/error.log | grep 'limiting'
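For per-zone, per-client counts, the rejection lines parse easily. The sample below mimics the shape of nginx's "limiting requests" message (exact fields can vary by version and configuration):

```python
import re
from collections import Counter

# Two sample lines in the shape of nginx's "limiting requests" message
log = '''\
2024/01/01 10:00:00 [error] 1234#1234: *55 limiting requests, excess: 20.520 by zone "login", client: 203.0.113.7, server: example.com, request: "POST /wp-login.php HTTP/1.1"
2024/01/01 10:00:01 [error] 1234#1234: *56 limiting requests, excess: 21.010 by zone "login", client: 203.0.113.7, server: example.com, request: "POST /wp-login.php HTTP/1.1"
'''

pat = re.compile(r'limiting requests.*?by zone "([^"]+)", client: ([0-9a-fA-F.:]+)')
hits = Counter()
for line in log.splitlines():
    m = pat.search(line)
    if m:
        hits[(m.group(1), m.group(2))] += 1   # (zone, client IP)

for (zone, ip), n in hits.most_common():
    print(f"{n:4d}  zone={zone}  client={ip}")
```

Point the same loop at /var/log/nginx/error.log to see which zones fire most and which clients trip them.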

WordPress-Specific Rate Limiting

For a WordPress site, these are the endpoints worth rate limiting most aggressively:

http {
    limit_req_zone $binary_remote_addr zone=wp_login:10m   rate=5r/m;
    limit_req_zone $binary_remote_addr zone=wp_comments:5m rate=1r/m;
    limit_req_zone $binary_remote_addr zone=xmlrpc:5m      rate=1r/m;
    limit_req_zone $binary_remote_addr zone=general:20m    rate=30r/s;

    server {
        # Credential stuffing protection
        location = /wp-login.php {
            limit_req zone=wp_login burst=3;
        }

        # XML-RPC is a DDoS amplification target — block entirely if unused
        location = /xmlrpc.php {
            limit_req zone=xmlrpc burst=1;
            # Or just block it: return 403;
        }

        # Comment spam protection
        location = /wp-comments-post.php {
            limit_req zone=wp_comments burst=1;
        }

        # wp-admin: protect but allow legitimate admin use
        location /wp-admin/ {
            limit_req zone=general burst=20 nodelay;
        }
    }
}

Redis-Backed Cross-Server Rate Limiting

NGINX’s built-in rate limiting uses shared memory within a single server. If you have multiple NGINX instances behind a load balancer, each one has independent rate limit state — a bot that hits different servers can exceed your intended rate by a factor of N.

For true cross-server rate limiting, use the NGINX Lua module with Redis:

location /wp-login.php {
    access_by_lua_block {
        local redis = require "resty.redis"
        local red = redis:new()
        red:set_timeouts(50, 50, 50)  -- connect/send/read, in ms

        local ok, err = red:connect("127.0.0.1", 6379)
        if not ok then
            ngx.log(ngx.ERR, "redis connect failed: ", err)
            return  -- fail open: don't lock everyone out if Redis is down
        end

        local key = "rl:login:" .. ngx.var.binary_remote_addr
        local count = red:incr(key)
        if count == 1 then red:expire(key, 60) end
        red:set_keepalive(10000, 20)

        if count and count > 5 then
            ngx.header["Retry-After"] = "60"
            return ngx.exit(429)
        end
    }
}
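The Lua snippet is a fixed-window counter: INCR a per-IP key, set its TTL on the first increment, reject above the threshold. The same logic sketched in Python, with a dict standing in for Redis (illustrative only; FixedWindowLimiter is not a real library class):

```python
import time

class FixedWindowLimiter:
    """Sketch of the INCR+EXPIRE pattern from the Lua example."""
    def __init__(self, limit, window):
        self.limit, self.window = limit, window
        self.store = {}  # key -> (count, window expiry time)

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        count, expiry = self.store.get(key, (0, 0))
        if now >= expiry:                       # window expired: reset
            count, expiry = 0, now + self.window
        count += 1                              # the INCR
        self.store[key] = (count, expiry)
        return count <= self.limit

rl = FixedWindowLimiter(limit=5, window=60)
print([rl.allow("203.0.113.7", now=0) for _ in range(6)])
# → [True, True, True, True, True, False]
```

Note the classic fixed-window caveat: up to 2x the limit can slip through across a window boundary. For login throttling at 5/minute that is usually acceptable.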

Frequently Asked Questions

What is the difference between limit_req and limit_conn?
limit_req limits the request rate (requests per second/minute). limit_conn limits simultaneous open connections. They’re complementary: limit_req stops bots from sending thousands of requests in a burst; limit_conn stops someone from holding thousands of connections open. Use both for full protection.
Will rate limiting block legitimate users?
Not if configured correctly. A real human browsing a website generates maybe 5–10 requests per second during active page loads. Set your general zone to 20–30r/s with burst=30 and legitimate users never see a 429. Bots trying to scrape or brute-force send orders of magnitude more — those get hit.
How much memory does a rate limit zone use?
About 64 bytes per tracked IP. A 10m zone (10 megabytes) holds roughly 160,000 IP addresses. For most sites this is more than enough. If you have a very high-traffic site with millions of unique IPs, increase the zone size proportionally.
Does rate limiting work with IPv6?
Yes — that’s why we use $binary_remote_addr (4 bytes for IPv4, 16 for IPv6) rather than $remote_addr (the string form, much larger). Note that rate limiting by individual IPv6 address can be gamed by rotating through a /64 block, since providers typically assign a whole /64 per customer. To rate limit by /64 subnet instead, build the zone key with a map that extracts the leading hextets of $remote_addr via a regex.
What’s the difference between burst with and without nodelay?
Without nodelay: burst requests are queued and processed at the zone’s rate. The queue adds latency but smooths traffic. With nodelay: burst requests are processed immediately, but each one consumes a burst slot that refills at the zone’s rate. Nodelay is better for user-facing pages (no added latency); without it is better for backend-sensitive operations where you want strict pacing.
Can I rate limit by cookie or header instead of IP?
Yes — the zone key can be any NGINX variable. Use $cookie_session_id to rate limit by session, $http_x_api_key to rate limit by API key, or "$binary_remote_addr$http_user_agent" to rate limit by IP+UA combination. Just make sure the key has bounded cardinality, or your zone memory fills up quickly.