When AI Agents Came for Our CTF, I Built a Gate!

By Saleh Alsaheb

~6 min read · May 22, 2026

I'm Saleh Alsaheb, a fourth-year Networks and Information Security Engineering student at Princess Sumaya University for Technology (PSUT). Penetration testing and web security are where I spend most of my time outside coursework, and last week the Cyber Security Club (CSC) at PSUT ran its annual CTF. This year I was the lead writer for the web category: every web challenge that hit the scoreboard had to go through me.

This article is about a problem I spent months thinking about, and the tool I built to solve it.

The problem nobody wanted to say out loud

AI agents are getting scary good. Not "useful assistant" good — autonomous task-completing good. A team of three students with a well-tuned agent loop can point an LLM at a CTF web challenge and watch it enumerate, exploit, and exfiltrate a flag while they go get coffee. I knew that wave was coming for us. So in the months leading up to the CTF, I kept asking myself: how do I keep my challenges solvable for everyone, without handing the entire web category to whichever team has the best agent?

The "just make everything insane" trap

The first answer I heard from other writers was the obvious one: crank the difficulty. Make every challenge so hard that even with an agent, you'd need a strong human in the loop.

I wasn't convinced. We had 35 to 40 teams registered. Maybe 5 to 10 were going to lean heavily on agents. The rest were students learning, beginners taking their first real shot at a CTF, mid-tier teams trying to climb. If I made every web challenge brutal to punish a handful of agent-users, I was punishing everyone else more. The agents would chew through hard challenges anyway — that's what they're built for. The humans I actually wanted to reach would bounce off the category entirely.

My job was to keep the web track fair — varied difficulty, real ramp, something for each skill level — while making sure agents couldn't shred through it in the first hour. I needed a different kind of defense.

CAPTCHAs are already dead

The next idea was simple: drop a CAPTCHA in front of each challenge. A puzzle. A "click here to prove you're human" box.

I tried it. Agents passed.

Modern agents drive real browsers, and modern vision models read distorted text well enough to clear most consumer CAPTCHAs. The "I'm not a robot" checkbox is even easier. I watched an agent solve a CAPTCHA in seconds and realized the entire genre of "prove you're human" was, for my purposes, finished.

That was the moment the project actually started.

Stop detecting. Start costing.

Instead of asking "how do I tell humans from bots?" — a question that gets harder every quarter — I changed the question to: "how do I make automated solving more expensive than human solving?"

If an agent can clear my gate but only by burning a chunk of its context window, spending real wall-clock time, and breaking its normal autoplay loop, the economics flip. A team that wanted an agent to clear five web challenges in twenty minutes now needs that agent to babysit a verification step each time: attempt, fail, retry. Meanwhile a human player sees the same gate as a five-second interruption.

Not "no bots get through." Instead: "bots pay so much to get through that they're no better than a human."
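
To make the economics concrete, here is a back-of-the-envelope model. It is only a sketch: every number in it is a made-up assumption for illustration, not a measurement from the event, and the `Solver` fields are hypothetical names. It assumes gate attempts are independent, so the expected number of attempts before the first success is 1 divided by the pass rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Solver:
    """Hypothetical per-solver costs; none of these values were measured."""
    name: str
    solve_minutes: float    # time to solve one challenge once past the gate
    gate_seconds: float     # wall-clock time for a single gate attempt
    gate_pass_rate: float   # probability that a single gate attempt succeeds


def gate_minutes(solver: Solver) -> float:
    """Expected minutes spent at the gate for one challenge.

    Independent attempts follow a geometric distribution, so the expected
    number of attempts until the first success is 1 / pass_rate.
    """
    expected_attempts = 1.0 / solver.gate_pass_rate
    return solver.gate_seconds * expected_attempts / 60.0


def expected_minutes(solver: Solver, challenges: int) -> float:
    """Expected total minutes to clear `challenges` gated challenges."""
    return challenges * (solver.solve_minutes + gate_minutes(solver))


# Assumed numbers: the agent solves fast (five challenges in twenty minutes
# ungated) but is slow and unreliable at the gate; the human is the opposite.
human = Solver("human", solve_minutes=30.0, gate_seconds=5.0, gate_pass_rate=0.95)
agent = Solver("agent", solve_minutes=4.0, gate_seconds=120.0, gate_pass_rate=0.10)

for s in (human, agent):
    total = expected_minutes(s, challenges=5)
    share = 5 * gate_minutes(s) / total
    print(f"{s.name}: {total:.1f} min total, {share:.0%} of it spent at the gate")
```

Under these assumptions the gate is a rounding error for the human and the dominant cost for the agent. Changing the assumed pass rate or attempt time moves the agent's total far more than the human's, which is the asymmetry the gate relies on.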

Building the gate

I spent weeks designing and building it. I'm keeping the internals private — it's deployed live in our competitions and I want to keep the defense surface intact — but at a high level, it's a reverse proxy that sits in front of any web challenge and forces multi-layer human verification before any traffic reaches the backend.

The principles I locked in early:

- Cheap for humans, expensive for agents. A real player solves it in five to ten seconds. An agent has to spend real tokens reasoning about it and real time waiting for rendering it can't shortcut.

- No offline solving. Every step is bound to a tight server-side time window: you can't download the puzzle, hand it to a vision model in another tab, and submit the answer separately.

- Sessions bound to the browser that solved them. Stealing a cookie after a human solves the gate doesn't give you free access.

- Layered, not single-point. Multiple independent signals — each one cheap to verify, each one another thing the attacker has to fake correctly.

- Tunable per challenge. The intro challenge wants a light gate. The high-value final challenge wants the hardest. That was the design choice I cared most about, and the one I almost learned the hard way.
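
Two of these principles, the server-side time window and browser-bound sessions, can be illustrated with a generic HMAC pattern. This is not the gate's actual code, which stays private. It is a minimal sketch using only the Python standard library. The key, the 20-second window, the token layout, and the `browser_fingerprint` string are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

SECRET_KEY = secrets.token_bytes(32)   # hypothetical server-side signing key
CHALLENGE_TTL_SECONDS = 20             # hypothetical "tight" time window


def _sign(message: str) -> str:
    """Hex HMAC-SHA256 of `message` under the server key."""
    return hmac.new(SECRET_KEY, message.encode(), hashlib.sha256).hexdigest()


def _same(a: str, b: str) -> bool:
    """Constant-time comparison; encoding first avoids errors on non-ASCII input."""
    return hmac.compare_digest(a.encode(), b.encode())


def issue_challenge(now: Optional[float] = None) -> str:
    """Return a signed token 'nonce.issued_at.signature' for one puzzle step."""
    issued_at = int(time.time() if now is None else now)
    payload = f"{secrets.token_hex(8)}.{issued_at}"
    return f"{payload}.{_sign(payload)}"


def challenge_is_fresh(token: str, now: Optional[float] = None) -> bool:
    """Accept a token only if it is authentic and still inside the window."""
    try:
        nonce, issued_at, signature = token.split(".")
        issued = int(issued_at)
    except ValueError:
        return False
    if not _same(signature, _sign(f"{nonce}.{issued_at}")):
        return False
    current = time.time() if now is None else now
    return 0 <= current - issued <= CHALLENGE_TTL_SECONDS


def issue_session(browser_fingerprint: str) -> str:
    """Return a cookie whose signature covers the solving browser's fingerprint."""
    session_id = secrets.token_hex(16)
    signature = _sign(f"{session_id}.{browser_fingerprint}")
    return f"{session_id}.{signature}"


def session_is_valid(cookie: str, browser_fingerprint: str) -> bool:
    """A cookie replayed from a browser with a different fingerprint fails."""
    try:
        session_id, signature = cookie.split(".")
    except ValueError:
        return False
    return _same(signature, _sign(f"{session_id}.{browser_fingerprint}"))
```

Because the timestamp is signed, a client cannot extend its own deadline, and because the fingerprint is part of the signed message, a copied cookie fails from any browser whose fingerprint differs. A real deployment would need more than this, for example single-use nonces and a fingerprint that is hard to forge, and this sketch makes no claim about how the gate handles either.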

Then it locked everyone out

Here's the part I learned the most from.

The first version went live with my challenges at the start of the CTF. About an hour in, the messages started coming. Teams couldn't get past the gate. Not because they'd failed it — because they'd solved it, gotten through, and been bounced back to solve another one. And another.

I'd set the difficulty too high. The verification was hard enough — and re-triggered aggressively enough on edge cases — that legitimate humans were getting stuck in a loop. The defense was working against the people I'd built it to protect.

In hindsight, the mistake was obvious. I'd spent months thinking about how to make the gate impossible for agents, and not enough about what happens when a tired student on a laptop at 3 AM mistypes one character. The agent was the abstract enemy. The player was the real user I'd forgotten to design for.

I dialed the difficulty down on the running instance and pushed the change while the CTF was still going. Teams started clearing the gate.

The category recovered.

After the event, I rebuilt the gate properly — not by weakening the security, but by separating concerns. Difficulty became a per-challenge dial the writer of each challenge could tune. The defenses themselves stayed strong; user-facing friction became something a CTF admin shapes per audience.
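
One way to picture that separation is a small configuration layer in which the security checks are fixed and only the friction settings vary per challenge. The sketch below is hypothetical throughout: the field names, presets, values, and challenge names are invented for illustration and do not describe the gate's real settings.

```python
from dataclasses import dataclass

# Defenses that stay on for every challenge, regardless of difficulty.
# The labels echo the design principles; they are not real setting names.
ALWAYS_ON = ("server-side time window", "browser-bound session")


@dataclass(frozen=True)
class GateProfile:
    """Hypothetical user-facing friction settings for one challenge."""
    verification_layers: int    # how many checks the player has to pass
    time_window_seconds: int    # how long the player gets per step
    max_rechallenges: int       # cap on re-prompts after a successful solve


# Illustrative presets only; the values are assumptions.
PROFILES = {
    "light": GateProfile(verification_layers=1, time_window_seconds=60, max_rechallenges=0),
    "standard": GateProfile(verification_layers=2, time_window_seconds=30, max_rechallenges=1),
    "hard": GateProfile(verification_layers=3, time_window_seconds=15, max_rechallenges=2),
}

# Each challenge writer picks a preset for their own challenge.
CHALLENGE_GATES = {
    "intro-web": "light",
    "final-web": "hard",
}


def profile_for(challenge: str) -> GateProfile:
    """Look up a challenge's profile, falling back to the lightest preset."""
    return PROFILES[CHALLENGE_GATES.get(challenge, "light")]
```

Capping re-prompts and defaulting unknown challenges to the lightest preset are choices made in this sketch so that a misconfiguration fails toward letting humans through rather than trapping them in a loop.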

The hardest part of a defensive system isn't stopping attackers. It's stopping them without breaking the legitimate users in the process. You will get that balance wrong on the first run. Build assuming that.

What it stops, and what it doesn't

The gate raises the cost of automated solving sharply. Scripts that try to brute-force, agents that try to skip rendering, tools that try to replay a stolen session — all hit walls. Most of the cheap "point an agent at the URL and walk away" attacks die at the gate.

The gate does not stop a determined human in a real browser, and it does not stop a team that uses an agent like a slow, expensive consultant — paying the time and token cost, accepting the agent will be slower than a human. That's fine. That's the goal. If a team spends the resources to make an agent behave like a careful human, they've earned the solve and burned the economic advantage that made me worry about agents in the first place.

Closing thought

The interesting question in security right now is not how do we stop AI from doing anything — that ship has sailed — but how do we design systems where automated attackers don't get an unfair multiplier, without punishing the humans the system is meant to serve. CAPTCHAs are dead. Pure detection is a losing arms race. Economic asymmetry — making automation pay a real cost humans don't — still works, and I think we're going to see a lot more defenses built around that frame.

As far as I know, this is the first defense of its kind deployed in a CTF in our region. Built as a student, for CSC at PSUT. The internals stay private. The idea I want to share, because other CTF organizers are about to face the same problem, and "make every challenge insane" is not the answer.

Thanks for reading.

#ctf #cybersecurity #security #ai #penetration-testing
