Block Vibe-Coded AI Exploit Scanners: Webserver Defense

Bots made up 49.6% of all internet traffic in 2023, and roughly a third of everything that hit your server was a bad bot, according to Imperva’s 2024 Bad Bot Report. Sit with that number for a second. Half of the noise pounding your access log isn’t a human, and most of that half isn’t Google or any other honest crawler. It’s a script. And lately a growing slice of those scripts is written by someone who has never read an RFC in their life, who asked a chatbot for “a Python tool that finds vulnerable WordPress sites,” pasted the answer, and pointed it at the entire IPv4 space before lunch.

Welcome to the vibe-coded exploit era. The barrier to writing an attack tool used to be skill. Now it’s a prompt. That sounds terrifying until you realise the same thing that makes these tools easy to make also makes them loud, sloppy, and gloriously easy to catch. This is the part nobody tells the juniors: you don’t beat volume with cleverness. You beat it with layers, each one cheap, each one boring, each one catching the trash the layer above let through.

Let me walk you through the whole stack, from the firewall rule that drops a scan before it costs you a CPU cycle, all the way down to the PHP setting that saves your bacon when everything above it fails. Because it will. Something always gets through. The only question is what’s waiting for it when it does.

What a vibe-coded attack actually looks like on the wire

Here’s the thing about a tool an LLM wrote in thirty seconds: it has tells. Lots of them. Real attackers spend effort hiding. Vibe-coders spend effort shipping, and the model optimises for “works on the happy path,” not “evades a WAF.” So you get this beautiful pattern of self-incrimination.

The user-agent is usually a dead giveaway. python-requests/2.31.0. Go-http-client/1.1. curl/8.5.0 firing 400 requests a second. Nobody browsing your blog ships a default requests UA. The wordlists are recycled: the same /wp-login.php, /.env, /.git/config, /phpmyadmin sweep that every GitHub “scanner” repo has copied from the last one since 2019. The error handling is non-existent, so the tool keeps hammering a path that’s already returned 403 forty times, because the model never wrote a backoff. And the TLS handshake is naked: a stock Go or Python client has a JA4 hash as recognisable as a thumbprint at a crime scene, because the author never thought to randomise the cipher order. They didn’t know it was a thing.

That’s your edge. Every one of those tells is something you can match on, cheaply, before the request ever reaches your application. The defense isn’t one magic rule. It’s a sieve with six meshes, and the trash gets caught somewhere on the way down.

The whole picture: defense in depth, six layers

Before we go layer by layer, look at the shape of the thing. A request comes in at the top. Each layer drops what it can and passes the rest down. Whatever survives all six gets logged, studied, and fed back to the top so the next one dies sooner. That feedback loop from the bottom layer back to the top is the part most people skip, and it’s the part that turns a static wall into something that learns.

Exploit tool / scanner
↓
Layer 1: Rate limiting & WAF
    req/s limits · ModSecurity CRS · geo-blocking · IP reputation
    → Kills mass scans, fuzzing and brute force automatically
↓ passes through
Layer 2: TLS hardening & security headers
    TLSv1.3 only · HSTS · CSP · X-Frame-Options · server token off
    → Shrinks fingerprint surface, blocks browser-side injection
↓ passes through
Layer 3: Request validation
    max body size · method whitelist · path normalisation · null-byte blocking
    → Catches the badly crafted payloads LLM tools generate
↓ passes through
Layer 4: Authentication & access control
    mTLS · fail2ban coupling · JWT validation · admin paths off the internet
    → Raises the exploit bar hard for script kiddies
↓ passes through
Layer 5: PHP layer hardening
    disable_functions · open_basedir · Snuffleupagus · expose_php off
    → Turns a webshell upload into an expensive way to print "hello"
↓ passes through
Layer 6: Observability & detection
    structured logging · 4xx-ratio alerting · user-agent analysis · honeypots
    → Sees what got through, feeds the signal back to Layer 1
↑ feedback from Layer 6 to Layer 1

Each layer is independent. Defense in depth means no single bypass owns the box.

The golden rule of this diagram: every layer assumes the one above it failed. That’s not pessimism. That’s how you sleep at night. Now let’s build it.

Layer 1: rate limiting and the WAF

This is your front door and it does the most work for the least money. A mass scanner’s entire business model is volume. Take the volume away and most of them just fall over, because the author never wrote retry logic.

Start with rate limiting in nginx or Angie. Two zones: one for the general site, one tight zone for the login and API paths that bots love.

limit_req_zone $binary_remote_addr zone=general:10m rate=20r/s;
limit_req_zone $binary_remote_addr zone=login:10m rate=3r/m;
limit_conn_zone $binary_remote_addr zone=conns:10m;

server {
    limit_conn conns 20;

    location / {
        limit_req zone=general burst=40 nodelay;
    }

    location = /wp-login.php {
        limit_req zone=login burst=2 nodelay;
        limit_req_status 429;
        # exact-match location: regex locations are skipped, so include your usual PHP handler here
    }
}

That rate=3r/m on the login path is not a typo. Three requests a minute. No human logs in faster than that, and a brute-force tool that expected to try ten thousand passwords now needs more than two days per IP (10,000 ÷ 3 per minute is about 55 hours). Most give up. The ones that don’t, Layer 4 handles.
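If you want to put your own numbers on that, here is a back-of-the-envelope sketch in Python. The 10,000-password wordlist is the figure from the paragraph above; the function name is mine, and the burst allowance is ignored because it only buys the tool a couple of extra requests.

```python
from datetime import timedelta


def time_to_exhaust(wordlist_size: int, allowed_per_minute: float) -> timedelta:
    """Time a single IP needs to try every password at a sustained rate limit."""
    minutes = wordlist_size / allowed_per_minute
    return timedelta(minutes=minutes)


if __name__ == "__main__":
    needed = time_to_exhaust(10_000, 3)
    print(f"{needed.total_seconds() / 3600:.1f} hours per IP")
```

Swap in your own rate and a realistic wordlist size. A tool spreading the job across many IPs divides that time accordingly, which is exactly why the later layers exist.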

On top of rate limiting, run a real WAF. ModSecurity with the OWASP Core Rule Set is the standard, and we have a full step-by-step guide to installing ModSecurity and the OWASP CRS on nginx if you’re starting from zero. The CRS is a giant pile of regexes that catch SQL injection, path traversal, command injection, the lot. Vibe-coded payloads get caught by it constantly, because the model generated a textbook ' OR 1=1-- that the CRS has matched since the Obama administration. Running WordPress? Our WordPress Hardening Plugin ships CRS-aware rules that block the common attacks without you editing a line of PHP, and the full writeup explains what it catches.

Start the CRS in DetectionOnly mode. I mean it. The day I flipped a fresh CRS install straight to blocking on a client site, paranoia level 2, I took down their checkout because a legitimate product description contained the word “select” near a quote. Six hours of my Saturday, gone, chasing a false positive that a week in detection mode would have shown me on day one.

SecRuleEngine DetectionOnly
SecDefaultAction "phase:2,log,auditlog,pass"
# After a week of clean logs, flip to:
# SecRuleEngine On
# SecDefaultAction "phase:2,log,auditlog,deny,status:403"

Then layer in cheap reputation filtering. You don’t need a paid threat feed to start. A map that drops the default scanner user-agents costs nothing and catches an embarrassing amount of vibe-coded traffic:

map $http_user_agent $bad_ua {
    default            0;
    ~*python-requests  1;
    ~*Go-http-client   1;
    ~*\bzgrab\b        1;
    ~*\bnuclei\b       1;
    ~*masscan          1;
    ""                 1;   # empty UA is almost always a bot
}

server {
    if ($bad_ua) { return 444; }   # 444 = drop the connection, no response
}

Return 444, not 403. A 403 is a polite “no”: a well-formed HTTP response that confirms a web server is alive and answering, and hands the tool a status code to branch on. 444 is nginx’s non-standard way of closing the connection without sending anything. The scanner’s bad error handling does the rest: it logs an empty reply or a dropped connection, shrugs, and moves on, having learned next to nothing about you.
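Before you ship a map like that, it’s worth replaying some real user-agents through the same patterns to see what you would have dropped. A minimal sketch, assuming Python’s re module treats these simple patterns the way nginx’s PCRE does (for plain substrings and \b it should); the function name and sample strings are mine.

```python
import re

# Same patterns as the nginx map; "~*" in nginx means case-insensitive.
BAD_UA_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (
        r"python-requests",
        r"Go-http-client",
        r"\bzgrab\b",
        r"\bnuclei\b",
        r"masscan",
    )
]


def is_bad_ua(user_agent: str) -> bool:
    """Return True if the map would set $bad_ua to 1 for this user-agent."""
    if user_agent == "":
        return True  # mirrors the "" entry in the map
    return any(p.search(user_agent) for p in BAD_UA_PATTERNS)


if __name__ == "__main__":
    samples = [
        "python-requests/2.31.0",
        "Mozilla/5.0 (X11; Linux x86_64; rv:126.0) Gecko/20100101 Firefox/126.0",
        "",
    ]
    for ua in samples:
        print(repr(ua), "->", "drop" if is_bad_ua(ua) else "pass")
```

Feed it the user-agent column from last week’s access log. If a monitoring service or an internal script of your own shows up in the “drop” pile, you found that out before production did.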

One war story for the road. Geo-blocking is tempting and mostly fine, but don’t blanket-ban entire countries without thinking about who lives there. I once watched a team block all of “Asia” with a GeoIP rule and then spend a frantic afternoon wondering why their own CDN’s Singapore PoP started failing health checks. Block what you must, allowlist what you need, and write a comment in the config saying why, because future-you will not remember.

Layer 2: TLS hardening and security headers

Layer 1 dropped the loud ones. Layer 2 shrinks how much you tell the rest. Every byte of metadata you leak is a byte a tool can match on, and a thinner fingerprint means more of those automated tools simply don’t know what they’re looking at.

Kill the old TLS. On a modern stack there is no reason to speak anything below TLSv1.2, and TLSv1.3 should be your floor wherever your client base allows it.

ssl_protocols TLSv1.3 TLSv1.2;
ssl_prefer_server_ciphers off;
ssl_conf_command Ciphersuites TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_128_GCM_SHA256;
ssl_session_tickets off;
ssl_stapling on;
ssl_stapling_verify on;

Then stop introducing yourself to strangers. server_tokens off hides your exact nginx version, so a scanner can’t map you straight to a CVE list. On Angie there’s a bit more you can do, and if you want the gory details of running a hardened build, the whole point of our extended Angie package is that the security knobs are already turned the right way.

Now the headers. These are mostly about the second class of attack: not the scanner hitting your server, but the payload that tries to run in your visitor’s browser. Set them once, in a snippet you include everywhere.

add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload" always;
add_header Content-Security-Policy "default-src 'self'; object-src 'none'; frame-ancestors 'self'; base-uri 'self'" always;
add_header X-Content-Type-Options "nosniff" always;
add_header X-Frame-Options "SAMEORIGIN" always;
add_header Referrer-Policy "strict-origin-when-cross-origin" always;
add_header Permissions-Policy "geolocation=(), microphone=(), camera=()" always;

That always keyword matters more than it looks. Without it, nginx skips the header on error responses, which means your 404 and 500 pages ship naked. Guess which pages an attacker is most likely to be staring at? Right. Add always or the header is theatre.
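A quick way to check that is to request a page that doesn’t exist and look at what the 404 actually carries. The sketch below uses only the Python standard library; the header list mirrors the snippet above, the function names are mine, and the URL in the usage note is a placeholder.

```python
import urllib.error
import urllib.request

EXPECTED = (
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Content-Type-Options",
    "X-Frame-Options",
    "Referrer-Policy",
    "Permissions-Policy",
)


def missing_security_headers(headers: dict[str, str]) -> list[str]:
    """Return the expected headers absent from a response (case-insensitive)."""
    present = {name.lower() for name in headers}
    return [name for name in EXPECTED if name.lower() not in present]


def fetch_headers(url: str) -> dict[str, str]:
    """Fetch response headers, including those on 4xx/5xx error pages."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return dict(resp.headers.items())
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx, but the error object still carries the headers
        return dict(err.headers.items())
```

Calling missing_security_headers(fetch_headers("https://example.com/definitely-not-here")) against your own host should come back as an empty list once always is in place. If it doesn’t, the error pages are the ones shipping naked.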

A word on Content-Security-Policy, since it’s the one that pages you at 3 a.m. A tight CSP is one of the most effective browser-side controls against cross-site scripting, and it’s also the one most likely to break your own site the moment someone adds an inline script. Roll it out in Content-Security-Policy-Report-Only first, watch the violation reports, then enforce. While you’re thinking about what the browser receives, the compression side-channel matters too: if you’re serving secrets over a compressed, dynamic response, read up on the BREACH attack before you assume HTTPS alone has you covered. It doesn’t.

Layer 3: request validation at the edge

This layer is where vibe-coded tools really show their underwear. A human-written exploit is crafted to look plausible. An LLM-generated one is often syntactically valid garbage: a 40 MB request body to a login form, a PUT where a POST belongs, a path with a null byte in it because the model copied a 2008 Stack Overflow answer about bypassing file extension checks.

Cap the body size. Tightly. WordPress media uploads aside, most of your endpoints have no business accepting megabytes.

client_max_body_size 2m;

location = /xmlrpc.php { deny all; }   # nobody misses it

location = /wp-login.php {
    # merge this into the Layer 1 rate-limit block: nginx rejects duplicate locations
    client_max_body_size 64k;
}

Whitelist methods. If an endpoint only ever sees GET and POST, say so, and watch the OPTIONS and TRACE probes bounce.

location /api/ {
    limit_except GET POST {
        deny all;
    }
}

nginx normalises paths before matching, which already eats a lot of the classic ../../../etc/passwd traversal noise, but don’t lean on that alone. The CRS from Layer 1 has dedicated traversal and null-byte rules, and they fire hard on exactly the kind of malformed input these tools spray. The combination is the point: nginx normalises, the CRS inspects, and the request that’s still standing is one a careful human wrote, which is a much smaller pile to worry about.

Here’s the trap nobody warns you about. merge_slashes is on by default, which collapses // into / when nginx matches locations. Sounds helpful. But nginx can still hand the original, uncollapsed URI to the backend (proxy_pass without a URI part forwards the request URI as the client sent it), so a poorly written upstream auth check that keys on the literal path string can be bypassed with a doubled slash. The default is right for most people. Just know it’s there before someone files a bug claiming your access rules “randomly” don’t apply.
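To see why the doubled slash bites, here is a deliberately tiny sketch of the two kinds of upstream check. The /admin/ prefix and the function names are hypothetical, and collapse_slashes only imitates the slash-merging part of what nginx does; real normalisation also decodes percent-escapes and resolves dot segments.

```python
import re


def naive_is_admin(raw_path: str) -> bool:
    """The fragile upstream check: compares the literal, unnormalised path."""
    return raw_path.startswith("/admin/")


def collapse_slashes(raw_path: str) -> str:
    """Collapse runs of slashes into one, roughly what merge_slashes does."""
    return re.sub(r"/{2,}", "/", raw_path)


def safer_is_admin(raw_path: str) -> bool:
    """Normalise first, then compare, so edge and backend see the same path."""
    return collapse_slashes(raw_path).startswith("/admin/")
```

With a request for //admin/users, naive_is_admin returns False and the request sails past the check, while safer_is_admin returns True. The fix belongs in the backend: normalise the path the same way the edge does before making any access decision on it.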

Layer 4: authentication and access control

Everything above is about cutting noise. This layer is about making sure the few requests that reach something sensitive have actually earned it. The single most effective thing you can do here costs nothing: take your admin surface off the public internet.

Your /wp-admin, your phpMyAdmin, your Grafana, your staging site. None of it needs to face the whole planet. Bind it to a VPN range or allowlist your own IPs and the entire category of “scanner finds login page, scanner brute-forces login page” evaporates.

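location ^~ /wp-admin/ {
    allow 10.8.0.0/24;     # your WireGuard subnet
    allow 203.0.113.7;     # office static IP
    deny all;
    # ^~ stops nginx from checking regex locations such as "~ \.php$",
    # so repeat your PHP handler (fastcgi_pass etc.) inside this block
}

# keep admin-ajax.php public, the front-end needs it
location = /wp-admin/admin-ajax.php {
    allow all;
    # an exact match skips regex locations too: add the PHP handler here as well
}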

That admin-ajax carve-out is the gotcha. Block all of /wp-admin/ without it and half your plugins’ front-end features stop working, and you’ll get a “the site is broken” ticket before the page cache even expires. Ask me how I know.

Couple your WAF to fail2ban so repeated offenders get banned at the firewall, where blocking costs a single iptables/nftables rule instead of a full request cycle through nginx and ModSecurity.

# /etc/fail2ban/jail.local
[nginx-limit-req]
enabled  = true
filter   = nginx-limit-req
logpath  = /var/log/nginx/error.log
maxretry = 10
findtime = 60
bantime  = 3600

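[nginx-badbots]
# this filter may not ship with your fail2ban package; a common approach
# is to copy filter.d/apache-badbots.conf to filter.d/nginx-badbots.conf
enabled  = true
port     = http,https
filter   = nginx-badbots
logpath  = /var/log/nginx/access.log
maxretry = 2
bantime  = 86400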

For anything you actually expose, like an API, validate the credential properly. If you’re handing out JWTs, check the signature and the algorithm, because the classic alg: none bypass is exactly the kind of thing a vibe-coded tool will try, having read about it in the same blog post you did. Mutual TLS (mTLS) is the heavier hammer for machine-to-machine traffic: if the client can’t present a cert you issued, the handshake never completes and your application code never runs. No request, no attack surface.

Layer 5: the PHP layer, assume the WAF missed one

Every layer so far lived in front of your application. But the whole premise of defense in depth is that one day a request gets all the way through, and on a WordPress or PHP site, that request lands in the PHP interpreter. So you harden the interpreter too, on the assumption that it’s the last thing standing.

Start in php.ini. Stop announcing your version, and lock down the file system the interpreter can see.

expose_php = Off
display_errors = Off
log_errors = On
allow_url_fopen = Off
allow_url_include = Off
open_basedir = /var/www/html:/tmp
disable_functions = exec,passthru,shell_exec,system,proc_open,popen,pcntl_exec

That disable_functions line is the one that turns a successful file-upload exploit into a dead end. The whole point of dropping a PHP webshell is to call system() and run commands. Take those functions away and the shell the attacker just uploaded is an expensive way to print “hello.” It won’t stop every exploit. It absolutely wrecks the most common one.

The caveat: some plugins legitimately call exec(), and the day you blanket-ban it you might break a backup plugin that shells out to mysqldump. Test in staging. Read the error log. This is the recurring theme of the whole job, isn’t it. Every good hardening control breaks something the first time, and the difference between a senior and a junior is that the senior expected it.
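Config drift is real, so it helps to check that the disable_functions line you think is live actually is. A rough sketch that scans php.ini text for the seven functions named above; the parser is intentionally naive (it only understands simple key = value lines and ; comments), and the function name is mine.

```python
WANTED = {
    "exec", "passthru", "shell_exec", "system",
    "proc_open", "popen", "pcntl_exec",
}


def still_enabled(php_ini_text: str) -> set[str]:
    """Return the dangerous functions that disable_functions does not cover."""
    disabled: set[str] = set()
    for line in php_ini_text.splitlines():
        line = line.split(";", 1)[0].strip()  # drop ; comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "disable_functions":
            disabled = {name.strip() for name in value.split(",") if name.strip()}
    return WANTED - disabled
```

Run it against the php.ini your FPM pool actually loads, not the one you edited, and treat a non-empty result as a prompt to go look. It says nothing about per-pool overrides, so it’s a smoke test, not an audit.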

For real teeth, run Snuffleupagus. It’s a PHP module that does virtual-patching and runtime hardening, the spiritual successor to the old Suhosin patch, and it can bind dangerous functions to specific call sites and kill the rest. We package it, and the ready-made myguard.rules ship inside our hardened Docker images on GitHub, because the default PHP posture is too trusting. Our step-by-step Snuffleupagus tutorial walks through writing those rules from scratch if you’d rather understand them than copy them. And keep an object cache like Redis or Valkey in front of your database so the brute-force login attempts that do slip through aren’t also hammering MySQL into the ground. If you’re weighing the options there, our writeup on Valkey, the Redis fork that actually won covers why we default to it now.

Layer 6: observability, or how you find out what got through

Here’s the uncomfortable truth. Something will get through. A WAF is a probabilistic filter, not a wall, and anyone who tells you their setup is unbreakable is selling something or hasn’t been breached yet. Layer 6 is the difference between learning about a compromise from your own dashboard and learning about it from a stranger on Twitter. Start with structured logs: one JSON object per request, so every field is something you can query and alert on.

log_format json_combined escape=json '{'
  '"time":"$time_iso8601",'
  '"ip":"$remote_addr",'
  '"status":$status,'
  '"method":"$request_method",'
  '"uri":"$request_uri",'
  '"ua":"$http_user_agent",'
  '"ja4":"$ssl_ja4"'
'}';
access_log /var/log/nginx/access.json json_combined;

That $ssl_ja4 field is the quiet hero. JA4 is a TLS client fingerprint, and because our Angie and nginx builds ship the JA4 module, you get the attacker’s handshake signature on every line. When a vibe-coded tool sweeps you, every request shares one JA4 hash, because the author never randomised the TLS stack. One hash, ten thousand requests, four hundred distinct paths, ninety percent 404s. That’s not a user. That’s a banner that says “ban me.”

Alert on the 4xx ratio, not just on errors. A sudden spike of 404s from a single IP or JA4 is a scan in progress. Pipe the logs into anything (Loki, an ELK stack, a cron job and a shell script if you’re scrappy) and trip an alert when one client crosses, say, fifty 404s in a minute.
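The scrappy version of that alert fits in a few lines. This sketch assumes the JSON log format shown earlier (the time, ip, status and ja4 fields), buckets 404s per client per minute, and returns whatever reaches the threshold. The function name and the default of fifty are mine, lifted from the paragraph above.

```python
import json
from collections import Counter
from typing import Iterable


def noisy_clients(
    lines: Iterable[str], threshold: int = 50, key: str = "ip"
) -> dict[tuple[str, str], int]:
    """Count 404s per (client, minute); return buckets at or over threshold."""
    counts = Counter()
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(entry, dict) or entry.get("status") != 404:
            continue
        # $time_iso8601 looks like 2025-01-01T12:34:56+00:00; keep up to the minute
        minute = str(entry.get("time", ""))[:16]
        client = str(entry.get(key, "unknown"))
        counts[(client, minute)] += 1
    return {bucket: n for bucket, n in counts.items() if n >= threshold}
```

Pass key="ja4" to group by TLS fingerprint instead of IP, which is what catches a scanner rotating through a proxy pool. What you do with the result (a mail, a webhook, a ban) is up to you.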

And plant a honeypot. A path that no human or legitimate crawler should ever request, like /wp-admin-secret-backup/, with one job: any IP that touches it gets instantly added to your deny list. Real users never find it. Scanners, working from their recycled wordlists, find it constantly. It’s the cheapest, highest-signal trap you can set, and it feeds straight back to Layer 1. That’s the loop closing. The thing that got caught at the bottom teaches the top to catch it sooner next time.
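One way to close that loop, sketched in Python: read the JSON access log, collect every IP that touched the honeypot path, and emit nginx deny lines. The path is the example from the paragraph above; the function names are mine, and how you write the file and reload nginx is left to your own tooling.

```python
import ipaddress
import json
from typing import Iterable

HONEYPOT_PREFIX = "/wp-admin-secret-backup/"  # example path from the text


def honeypot_hits(lines: Iterable[str]) -> set[str]:
    """Return the set of valid client IPs that requested the honeypot path."""
    ips: set[str] = set()
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(entry, dict):
            continue
        if not str(entry.get("uri", "")).startswith(HONEYPOT_PREFIX):
            continue
        try:
            # Validate before the value gets anywhere near a config file.
            ips.add(str(ipaddress.ip_address(str(entry.get("ip", "")))))
        except ValueError:
            continue
    return ips


def as_nginx_deny(ips: Iterable[str]) -> str:
    """Render one 'deny <ip>;' line per address, sorted for stable diffs."""
    return "".join(f"deny {ip};\n" for ip in sorted(ips))
```

Write the output of as_nginx_deny to a file you include from your server block and reload nginx on a timer. Make sure your own monitoring never requests the honeypot path, or you’ll ban yourself with great efficiency.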

Docker isolation: shrink the blast radius

Say the worst happens. The WAF missed it, the PHP hardening didn’t catch it, and an attacker has code execution inside your web container. Defense in depth says: fine, now what can they reach? If the answer is “the whole host as root,” you built a wall with a door behind it. If the answer is “a read-only filesystem, no capabilities, no other containers,” you’ve turned a breach into an inconvenience.

Run the container as a non-root user, read-only, with every Linux capability dropped and no ability to gain new ones.

services:
  web:
    image: angie:hardened
    read_only: true
    user: "1000:1000"
    cap_drop: [ALL]
    security_opt:
      - no-new-privileges:true
    tmpfs:
      - /tmp
      - /run
    networks: [frontend]

A read-only root filesystem alone neutralises a huge class of exploits, because the attacker can’t write their toolkit anywhere persistent. cap_drop: ALL means even if they’re “root” inside the container, that root can’t load kernel modules, can’t mess with the network stack, can’t do most of what root implies. And network segmentation means the compromised web container can’t pivot to your database container, because they don’t share a network. This is the whole game: containment. We went deep on every one of these flags in the Docker hardening guide for self-hosters, so I won’t repeat the lot here. And it’s not theory: the same flags lock down our own hardened Roundcube webmail image, which runs as nobody and can chown nothing.

Patching: the boring layer that does the most

I saved the least glamorous one for last, because it’s the one that actually matters most, and the one juniors skip because it’s boring. Here’s the flat truth, no hedging: the overwhelming majority of “hacks” are not clever zero-days. They are a known CVE, with a patch available for months, against software nobody updated. The vibe-coded tool sweeping you isn’t smart. It’s just checking whether you did your homework.

So do your homework automatically. On Debian and Ubuntu, turn on unattended security upgrades and stop thinking about it.

apt install unattended-upgrades
dpkg-reconfigure -plow unattended-upgrades

Keep your container images rebuilt on a schedule, not whenever you happen to remember. A “latest” tag you pulled eight months ago is not latest, it’s a museum piece full of CVEs. Our Angie and nginx Docker images rebuild daily for exactly this reason, so the base layer you pull from Docker Hub was patched this morning, not last winter. The reason this works: the gap between “CVE published” and “exploit tool ships it” is now days, sometimes hours, because the tool-maker can ask a model to write the exploit from the CVE description. The patch window has collapsed. Automated patching is no longer optional hygiene, it’s the only way to keep pace with automated attacks.

There’s a grim symmetry to it. The same AI that lets a kid write a scanner in one prompt also lets defenders find bugs faster, as the curl project discovered when AI tooling dredged up a record pile of vulnerabilities (and a flood of garbage reports alongside the real ones). If you want that whole saga, we wrote up curl’s record AI-found vulnerability patch separately. The arms race is real, both sides got a force multiplier, and the only people who lose are the ones running unpatched software and hoping.

What is a vibe-coded exploit tool?

It’s an attack script written largely by an AI model rather than a skilled human. Someone asks a chatbot for something like a vulnerability scanner, pastes the output, and runs it. The barrier used to be skill; now it’s a prompt. The upside for defenders is that these tools are loud and sloppy: default user-agents, recycled wordlists, no retry logic, and unrandomised TLS fingerprints, all of which make them easy to catch at the edge.

Can a WAF alone stop AI-generated attacks?

No, and anyone claiming otherwise is overselling. A WAF like ModSecurity with the OWASP CRS catches a large share of textbook payloads, but it’s a probabilistic filter, not a wall. That’s why defense in depth matters: rate limiting in front, request validation and access control behind it, PHP and container hardening below that, and observability to catch whatever slips through. No single layer owns the box.

Why return 444 instead of 403 to bad bots?

A 403 is a valid HTTP response that confirms your server is alive and answering, which is information a scanner logs and uses. nginx’s non-standard 444 closes the connection with no response at all. The tool records an empty reply or a dropped connection and moves on, having learned next to nothing. Against tools with poor error handling, which describes most vibe-coded ones, 444 is usually the better choice.

What is JA4 and how does it help against scanners?

JA4 is a fingerprint of a client’s TLS handshake, the cipher order, extensions, and version it offers. A stock Python or Go HTTP client has a fixed, recognisable JA4 hash because its author never randomised the TLS stack. Log the JA4 of every request and a mass scan shows up as thousands of requests sharing one hash, which is a clean signal to rate-limit or ban. The myguard Angie and nginx builds ship the JA4 module.

Is automatic patching safe for a production server?

For security updates on Debian and Ubuntu, unattended-upgrades is well-tested and the risk of skipping patches now far outweighs the small risk of a bad update, because the window between a CVE going public and an exploit tool shipping it has collapsed to days. Pin it to the security pocket only, keep backups, and rebuild container images on a schedule so your base layer isn’t a museum piece full of known holes.

Does taking wp-admin off the public internet break the site?

Not if you do it right. Allowlist your VPN range and static IPs for /wp-admin/, but keep /wp-admin/admin-ajax.php public, because front-end plugin features depend on it. Block that one file by accident and you’ll get a ‘site is broken’ ticket fast. With the carve-out in place, the entire ‘scanner finds login, scanner brute-forces login’ attack category disappears.

Where to start tomorrow morning

You don’t build all six layers in one afternoon, and you shouldn’t try. Pick the cheapest, highest-impact wins first. Turn on rate limiting and the bad-UA map today, they take ten minutes and cut the noise immediately. Put the WAF in detection mode this week and read its logs before you flip it to blocking. Take your admin surface off the public internet this month. Turn on unattended-upgrades right now, before you close this tab, because that’s the one that quietly handles the attacks you’ll never even see.

The attackers got a force multiplier. So did you. The tools to build every layer above are free, open source, and mostly a few lines of config. The vibe-coders are betting you didn’t bother. Prove them wrong.

Anyway. Go check whether your last backup actually restores before you touch any of this. You did test it, right?