Introduction

Over the last couple of weeks I tried to customize a Web Application/API penetration testing AI agent and tested it against 100+ sites, which gave me a good view of its strong and weak points. I also pointed the agent at a specific CVE that existed in an environment and told it to find and exploit it, which turned into a strange back-and-forth that eventually led to finding it after a few tries. At the end I'll share my opinion on how reliable AI is for security testing and whether it will eventually replace penetration testers/security engineers.

Setup Used

I relied on Claude Code for the agent orchestration, with Opus 4.6 as the LLM. I just think the tools it supports and the models on offer cover the needs here, even though Opus is considered token-expensive.

I also created a Skill to direct the agent through the PT process, with a scenario-based vulnerability checklist, and to eventually write a report with the findings and some logs.
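The skill file itself isn't shown in this post. Claude Code skills are markdown files with a short YAML frontmatter, so a minimal sketch of what such a skill could look like follows — the name, checklist items, and report format here are my own illustration, not the actual skill used:

```markdown
---
name: web-pentest
description: Scenario-based web app/API penetration test with reporting
---

# Web Pentest Skill

1. Recon: map the target, enumerate endpoints, read the client-side JavaScript.
2. Pick the checklist scenario that matches the app (authenticated portal, public API, admin panel, ...).
3. Work through that scenario's vulnerability checklist, logging every request/response used as evidence.
4. Write a findings report (severity, reproduction steps, evidence) plus the raw logs.
```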

Additionally, I wrote a small PowerShell script to loop through the scope and test each site/asset in a separate session, to keep things clean in terms of context windows and to avoid compacting the conversation.
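The original script was PowerShell and isn't included in the post; here is the same idea sketched in Python instead. It assumes a one-target-per-line scope file and the `claude` CLI's non-interactive `-p` flag — the prompt text and file name are illustrative:

```python
import subprocess


def build_command(target: str) -> list[str]:
    """Build a one-shot `claude -p` invocation for a single target.

    Each target runs in its own process, i.e. a fresh session with a
    fresh context window, so no conversation ever needs compacting.
    """
    prompt = f"Run the web pentest skill against {target} and write a report."
    return ["claude", "-p", prompt]


def run_scope(scope_file: str) -> None:
    """Loop through the scope file and test each site in its own session."""
    with open(scope_file) as f:
        targets = [line.strip() for line in f if line.strip()]
    for target in targets:
        subprocess.run(build_command(target), check=False)
```

Calling `run_scope("scope.txt")` then walks the whole scope one asset at a time, mirroring what the PowerShell loop did.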

Testing Results

100+ sites in a production environment were scanned over roughly two days, each one in a separate session. I am showcasing only a sample of the Critical/High vulnerabilities found; there were also tons of low-impact ones, but these are the ones worth mentioning:

[Screenshot of sample findings]

A couple of important points to address here:

  1. The tests were authorized black-box tests.
  2. Some of the findings were false positives or just logically incorrect.

Analyzing Results

The agent was great at analyzing JavaScript (to extract secrets, API endpoints, external services, etc.), testing default and weak credentials, discovering authorization issues, and finding business logic issues (not always, but reliably when they were explained well). It was generally good at following the provided skill while adapting its methodology to the scenario at hand.

Honestly, I feel it would be great for teams (especially internal teams) to have such autonomous agents running periodically, where teams can customize them and introduce new methodologies to them.

Missing Things

As mentioned before, some of the vulnerabilities discovered were false positives or logically incorrect. For example, if an endpoint returns the string "2000 Passwords decrypted", the agent will report it as a Critical; or it would report that registration is open on an admin panel simply because a 200 response was received, even though no user actually gets registered.

Also, for some reason the agent sometimes likes to cut corners and not go all the way in. For example, if a website has a register functionality, it would register a user and maybe test some features inside, but if there are a lot of services inside it would just finish the test and start reporting. This can probably be fixed by altering the prompt or the skill.

There's also the issue of the models refusing to do some things due to the safety constraints applied to them. I've seen AI vendors such as Anthropic offer a form to get whitelisted for security work, which can solve this issue, but most of the time I was able to get a working POC without one.

The CVE Hunt

Note: this wasn't a hunt for a new CVE or security research; I was just testing the agent's ability to find a previously reported CVE.

I started by telling it to study the CVE, as I didn't want it to just grab a POC from the internet and execute it. It did study it, and everything went well; the test was against about 15 servers.

On the first run the task was just to test these servers against this CVE, which came back with zero results, so I pushed on it again and told it to go and verify the results, which didn't work either :) On the third attempt I told it that there was definitely a vulnerability there, which finally led to it writing a Python script to test across all servers and finding it.
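The agent's script isn't reproduced here, and the CVE stays unnamed, so what follows is only a sketch of the fan-out pattern such a script would use: probe all ~15 servers in parallel and collect the ones that look vulnerable. The `check_target` probe is a placeholder, since the actual request depends entirely on the CVE being tested:

```python
from concurrent.futures import ThreadPoolExecutor


def check_target(host: str) -> bool:
    """Placeholder probe: send the CVE-specific request to `host` and
    return True if the response indicates the vulnerable behavior.
    The real logic depends entirely on the CVE under test."""
    raise NotImplementedError


def scan(hosts, check=check_target, workers=8):
    """Run the probe against every host in parallel and return the
    hosts that appear vulnerable, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(check, hosts))
    return [h for h, vulnerable in zip(hosts, results) if vulnerable]
```

Running every server through one script like this is also what removed the flakiness of testing hosts one by one in conversation.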

[Screenshot of the CVE being found]

Conclusion

Seeing what can be done with AI with a few prompts and a .md file is impressive, and it can bring value to security researchers and people who work in IT in general. The scale and speed it works at are definitely things a human can take advantage of to do great things.

In some cases the agent was even able to execute a full attack chain, combining 3 or 4 different vulnerabilities into critical-impact findings.

With everything said, an agent still isn't reliable enough to be trusted to secure an environment without human guidance. Maybe in a couple of years, who knows.