Nginx Rate Limit Config Generator
Generate a ready-to-paste nginx rate-limiting configuration. Set a request rate and the key to throttle on, choose a burst size and whether excess requests are served immediately (nodelay), delayed, or rejected, optionally add a connection limit, and pick the memory zone size and the HTTP status returned when the limit trips. You get the matching http { } and server { } snippets plus a plain-English summary. The output updates live in your browser.
How to use the Nginx Rate Limit Config Generator
Choose what to throttle on with the key. This is almost always the client IP via $binary_remote_addr, the compact binary form of the address that keeps the shared-memory zone small. Set the rate (requests per second or per minute) and the zone memory: roughly 16,000 64-byte IP states fit per megabyte, so 10 MB tracks about 160,000 clients. The generator emits a limit_req_zone for the http { } block and a limit_req for your location, plus the matching connection-limit directives if you enable them.
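As a rough illustration of the kind of output described above, a minimal fragment might look like this. The zone name per_ip, the /api/ path, and the 10r/s rate are assumptions made for the example, not generator defaults, and the fragment is not a complete nginx.conf.

```nginx
http {
    # One shared-memory zone keyed on the client IP.
    # "per_ip" is a hypothetical zone name; 10m is the zone size.
    limit_req_zone $binary_remote_addr zone=per_ip:10m rate=10r/s;

    server {
        listen 80;

        location /api/ {
            # Apply the zone to this location. No burst is set here,
            # so any request over the rate is rejected.
            limit_req zone=per_ip;
        }
    }
}
```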
The subtle part is burst and nodelay. The base rate is enforced as a steady trickle, so without a burst even slightly bunched-up legitimate requests get rejected. A burst allows a queue of that many excess requests. With nodelay, queued requests are served immediately as long as the queue isn't full (the usual choice for APIs); with delay=N, the first N excess requests are served immediately and the rest are paced; and the queue-all option paces every queued request at the base rate. Requests beyond rate-plus-burst receive the status you set. nginx defaults to 503, but 429 Too Many Requests is the code that describes the condition accurately. Put the limit_req_zone line in the http context and the limit_req line in the relevant server or location, then validate and reload with nginx -t && nginx -s reload.
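The three burst behaviours can be sketched side by side. The zone name, path, and numbers below are illustrative assumptions; only one limit_req line would normally be active for a given zone in a location, so the alternatives are left commented out.

```nginx
# http { } context
limit_req_zone $binary_remote_addr zone=per_ip:10m rate=10r/s;

server {
    # Status sent to rejected requests (nginx uses 503 unless told otherwise).
    limit_req_status 429;

    location /api/ {
        # nodelay: up to 20 excess requests are served immediately.
        limit_req zone=per_ip burst=20 nodelay;

        # delay=8: the first 8 excess requests are served immediately,
        # the next 12 are paced, and anything beyond 20 is rejected.
        # limit_req zone=per_ip burst=20 delay=8;

        # Burst only: all 20 excess requests are queued and paced
        # at the base rate.
        # limit_req zone=per_ip burst=20;
    }
}
```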
How nginx rate limiting works
Nginx rate limiting is built on the leaky bucket algorithm. You declare a shared-memory zone with limit_req_zone that records, per key, the time of the last request and the current number of excess requests; limit_req then enforces a target rate by treating requests as water draining from a bucket at a constant rate. Requests that arrive faster than the rate would overflow the bucket and are rejected, unless a burst gives the bucket extra capacity to absorb short spikes. This is why a bare rate with no burst feels surprisingly strict: real traffic is bursty, and a 10 r/s limit literally means one request every 100 ms, so two requests 50 ms apart already violate it.
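The strictness of a bare rate is easier to see with a timeline. The fragment below is a simplified sketch for a single client; the zone name strict is hypothetical, and the comments describe the expected behaviour under the 100 ms spacing explained above rather than measured output.

```nginx
# http { } context: a zone with no burst applied anywhere.
limit_req_zone $binary_remote_addr zone=strict:10m rate=10r/s;

server {
    location / {
        limit_req zone=strict;
        # 10r/s is tracked as one request per 100 ms, so for one client:
        #   t =   0 ms  request A  -> accepted
        #   t =  50 ms  request B  -> rejected (only 50 ms after A)
        #   t = 100 ms  request C  -> accepted (100 ms after A)
    }
}
```

Adding even a small burst would let request B through instead of rejecting it.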
The key defines what gets limited. Keying on $binary_remote_addr limits per client IP; keying on a URI or a header limits per endpoint or per token. The choice interacts with your topology: behind a CDN or load balancer every request appears to come from the proxy's IP, so you must either restore the client address from X-Forwarded-For with the realip module (trusting only your proxy's addresses) or accept that limiting will be coarse. The zone memory bounds how many distinct keys can be tracked. When the zone fills, nginx removes the least recently used entry, and if it still cannot create a new one the request is terminated with an error, so size the zone for your expected client population.
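A sketch of the key choices and the proxy case follows. The 10.0.0.0/8 range, the X-API-Key header, and the zone names are placeholders for illustration; the realip directives require nginx to be built with ngx_http_realip_module.

```nginx
# http { } context

# Behind a proxy: take the client address from X-Forwarded-For,
# but only for requests arriving from the trusted proxy range.
set_real_ip_from  10.0.0.0/8;        # assumed load balancer / CDN range
real_ip_header    X-Forwarded-For;
real_ip_recursive on;

# Per client IP.
limit_req_zone $binary_remote_addr zone=per_ip:10m rate=10r/s;

# Per API token, read from a hypothetical X-API-Key request header.
# Requests with an empty key are not counted against the zone.
limit_req_zone $http_x_api_key zone=per_token:10m rate=5r/s;

# Per endpoint (normalized path, without the query string).
limit_req_zone $uri zone=per_uri:10m rate=100r/s;
```

Longer keys such as tokens or URIs take more space per entry than a binary IP address, so the same zone size tracks fewer of them.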
The burst/nodelay combination shapes the experience. With nodelay, the burst capacity lets bunched requests through instantly while still capping the sustained rate, which is ideal for APIs where latency matters. Without it, excess requests are delayed to smooth traffic into a steady stream, which protects a fragile backend at the cost of added latency. Anything beyond the rate-plus-burst allowance is refused with the limit_req_status code (503 by default; set it to 429). A separate mechanism, limit_conn, caps the number of simultaneous connections per key rather than their rate, which is useful against slow-loris-style abuse and large concurrent downloads. The two are complementary, and this generator can emit both. Always validate with nginx -t before reloading.
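For the connection limit, a minimal fragment could look like this. The zone name conn_per_ip, the /downloads/ path, and the cap of 10 are assumptions for the example.

```nginx
# http { } context: shared-memory zone for counting connections per IP.
limit_conn_zone $binary_remote_addr zone=conn_per_ip:10m;

server {
    # Status sent when the connection cap is hit (503 unless overridden).
    limit_conn_status 429;

    location /downloads/ {
        # At most 10 simultaneous connections per client IP.
        limit_conn conn_per_ip 10;
    }
}
```

It can sit alongside a limit_req in the same location, since one bounds concurrency and the other bounds rate.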
Common use cases
- API protection. Cap requests per IP with a burst and nodelay so legitimate spikes pass but abuse is throttled.
- Login endpoints. Apply a strict per-IP rate to slow brute-force attempts against authentication.
- Backend shielding. Use delayed limiting to smooth bursty traffic into a steady stream a fragile service can handle.
- Connection caps. Add limit_conn to bound simultaneous connections against slow-client and download abuse.
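Two of the cases above, login throttling and backend shielding, might be sketched as follows. The paths, zone names, and rates are illustrative assumptions and should be tuned against real traffic.

```nginx
# http { } context
# Login: a strict per-minute rate per client IP.
limit_req_zone $binary_remote_addr zone=login:10m rate=5r/m;
# Shielding: a per-IP rate applied with delay to smooth traffic.
limit_req_zone $binary_remote_addr zone=smooth:10m rate=20r/s;

server {
    limit_req_status 429;

    location = /login {
        # Allow a few quick retries, then reject further attempts.
        limit_req zone=login burst=5 nodelay;
    }

    location /reports/ {
        # No nodelay: excess requests are queued and released at 20r/s.
        limit_req zone=smooth burst=40;
    }
}
```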