June 2, 2026
Spy Before You Prompt: Passive Recon for LLM Agents, RAG and the AI Stack
Jonathan (PardaL)
How seven read-only techniques mapped five LLM agents, exposed Swagger, and flagged debug endpoints — in 66 seconds, without sending a single chat message.
Most offensive testers still recon AI systems like traditional web apps: port 443, a login page, maybe a /api prefix. That misses the point.
An AI-facing target is not one application. It is an ecosystem — orchestration layers, RAG pipelines, vector stores, inference APIs, and often multiple agents sharing the same codebase on high, non-standard ports. The footprints are different. The recon should be too.
This article walks through that recon: seven passive techniques, one lab target, one open-source tool (SpyAI), and a full attack-surface map produced before anyone typed a prompt.
The Thesis
You can learn more about an AI system from its infrastructure than from its chat window.
Swagger docs expose debug endpoints. /health JSON reveals agent names. OpenAPI schemas list RAG ingestion paths. JavaScript on the chat UI confirms POST-only APIs that return 405 on a naive GET. All of this is observable without credentials, without fuzzing, and without talking to the model.
That is passive recon for AI — and it is the difference between spraying generic prompt injections and targeting the right agent, endpoint, and trust boundary.
Passive ≠ Silent
SpyAI does not send credentials, fuzz aggressively, or deliver exploit payloads. Every probe is read-only: TCP connect checks, HEAD requests, and GET calls against a curated wordlist.
But passive here does not mean invisible. SpyAI generates network traffic. Run it only on systems you own or have explicit permission to test. Unauthorized scanning may violate law or policy.
Mental model: read-only intent, active network presence.
Lab Setup & Disclaimer
All examples in this article are based on scans conducted within my personal Offensive Security lab environment. The IP addresses mentioned, such as 192.168.217.21, are private addresses (RFC1918) used solely within this isolated lab setup and are not accessible from the public internet. This article does not disclose any sensitive or proprietary information related to Offensive Security labs.
One command:
# Run the full passive recon pipeline against one authorized target.
./spy-ai 192.168.217.21
~66 seconds later, SpyAI produces a Markdown report and a full execution log. Here is what came back:
| Port | Service    |
| ---- | ---------- |
| 22   | ssh        |
| 5432 | postgresql |
| 8001 | http       |
| 8002 | http       |
| 8003 | http       |
| 8011 | http       |
| 8012 | http       |

Seven ports. Five in the AI-likely range. One PostgreSQL backend. Zero chat messages sent.
That pattern — multiple high ports + a relational database — already suggests a multi-agent deployment with persistent storage, likely RAG or session state. The architecture is taking shape before technique one even runs.
The Seven Techniques
1. Parallel TCP Connect on AI-Likely Ports
Generic scanners stop at 22, 80, 443. AI stacks live elsewhere: inference APIs on 8000–8080, Ollama on 11434, Elasticsearch on 9200, Kibana on 5601, MinIO on 9000/9001.
SpyAI parallelizes TCP connect probes across 106 ports using a stdlib-only Python scanner — no nmap dependency:
import concurrent.futures
import socket
from typing import List, Sequence

# Returns True if a TCP handshake to host:port succeeds within the timeout.
def try_port(host: str, port: int, timeout: float) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except (OSError, socket.timeout):
        return False

# Probes every port in parallel and returns the sorted list of open ones.
def scan(host: str, ports: Sequence[int], timeout: float, workers: int) -> List[int]:
    open_ports: List[int] = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(try_port, host, p, timeout): p for p in ports}
        for fut in concurrent.futures.as_completed(futures):
            port = futures[fut]
            if fut.result():
                open_ports.append(port)
    return sorted(open_ports)

Ports in 8000–8080 are labeled ai-likely. Known services get explicit hints: 11434 → Ollama, 9200 → Elasticsearch, 5601 → Kibana.
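The labeling rule described above can be sketched in a few lines. This is a minimal illustration, not SpyAI's actual code: the `KNOWN_HINTS` table and both function names are assumptions, populated only with the port-to-service pairs named in this article.

```python
from typing import Dict, List

# Hypothetical hint table, built from the services named in this article.
KNOWN_HINTS: Dict[int, str] = {
    11434: "Ollama",
    9200: "Elasticsearch",
    5601: "Kibana",
    9000: "MinIO",
    9001: "MinIO",
}


def label_port(port: int) -> str:
    """Return a service hint, 'ai-likely' for 8000-8080, else 'unlabeled'."""
    if port in KNOWN_HINTS:
        return KNOWN_HINTS[port]
    if 8000 <= port <= 8080:
        return "ai-likely"
    return "unlabeled"


def label_ports(open_ports: List[int]) -> Dict[int, str]:
    """Label every open port, keyed in ascending port order."""
    return {p: label_port(p) for p in sorted(open_ports)}


# The seven open ports from the lab scan.
labels = label_ports([22, 5432, 8001, 8002, 8003, 8011, 8012])
ai_likely = [p for p, hint in labels.items() if hint == "ai-likely"]
print(ai_likely)
```

Explicit hints are checked before the range rule so that a known service is never downgraded to the generic ai-likely label.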
Live output:
# Scan 106 common + AI-likely ports; 7 came back open.
$ python3 lib/scan_ports.py 192.168.217.21 -p 22,25,53,80,...,11434,27017
scanning 106 port(s) on 192.168.217.21 ...
open: 7
22 5432 8001 8002 8003 8011 8012

Takeaway: Five AI-likely ports on one host is not a single chatbot. It is a fleet.
2. HTTP Banner Fingerprinting (HEAD /)
Once HTTP-likely ports are open, SpyAI sends HEAD / — headers only, no body.
# HEAD request for headers only; the response fingerprints the framework.
$ curl -sS -k -I http://192.168.217.21:8001/
HTTP/1.1 405 Method Not Allowed
server: uvicorn
allow: GET
content-type: application/json

405 + server: uvicorn + allow: GET is classic FastAPI/Starlette: the root serves a chat UI via GET, not HEAD. Framework confirmed before authentication.
All five agent ports returned the same signature. Same stack, multiple instances.
SpyAI also captures custom headers when present — X-AI-Backend, X-RAG-Provider, X-Model, X-Inference-Engine — which can fingerprint the inference provider or RAG backend without touching application logic.
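As a rough sketch of how such a HEAD response might be turned into hints, the function below works on a status code and a header mapping. The function name, the hint strings, and the `AI_HEADERS` tuple are illustrative assumptions; only the header names themselves come from this article.

```python
from typing import Dict, List, Mapping

# Custom headers this article says SpyAI captures when present (lowercased).
AI_HEADERS = ("x-ai-backend", "x-rag-provider", "x-model", "x-inference-engine")


def fingerprint(status: int, headers: Mapping[str, str]) -> List[str]:
    """Turn one HEAD / response into a list of human-readable hints."""
    # HTTP header names are case-insensitive, so normalize them first.
    h: Dict[str, str] = {k.lower(): v for k, v in headers.items()}
    hints: List[str] = []

    if "uvicorn" in h.get("server", "").lower():
        hints.append("ASGI server: uvicorn")

    allowed = {m.strip().upper() for m in h.get("allow", "").split(",") if m.strip()}
    if status == 405 and "GET" in allowed:
        hints.append("HEAD rejected but GET allowed: FastAPI-style route")

    for name in AI_HEADERS:
        if name in h:
            hints.append(f"{name}: {h[name]}")
    return hints


# The response captured from port 8001 in this article.
print(fingerprint(405, {"server": "uvicorn", "allow": "GET",
                        "content-type": "application/json"}))
```

Keeping the parser separate from the network call means the same logic can be replayed against saved scan logs.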
Takeaway: A 405 is information, not a dead end.
3. Curated Path Enumeration
Port scan tells you where. Path enumeration tells you what.
SpyAI probes 34 paths per open HTTP port from a wordlist grouped by ecosystem:
| Category            | Paths                                         |
| ------------------- | --------------------------------------------- |
| FastAPI / Swagger   | `/openapi.json`, `/docs`, `/redoc`, `/health` |
| OpenAI-compatible   | `/v1/models`, `/v1/chat/completions`          |
| Object storage      | `/minio/health/live`, `/minio/health/ready`   |
| K8s / observability | `/-/health`, `/-/metrics`, `/metrics`         |

Hits on port 8001:
| Path            | Status | Notes                         |
| --------------- | ------ | ----------------------------- |
| `/`             | 200    | title="IT Helpdesk Assistant" |
| `/docs`         | 200    | FastAPI Swagger UI            |
| `/redoc`        | 200    | ReDoc                         |
| `/openapi.json` | 200    | 4,906 bytes                   |
| `/health`       | 200    | agent identity (below)        |
| `/chat`         | 405    | `allow: POST`                 |

# Unauthenticated GET on /health leaks the agent's name and bound port.
$ curl http://192.168.217.21:8001/health
{"status":"healthy","agent":"IT Helpdesk Assistant","port":8001}

One unauthenticated request: agent name, health status, bound port.
Takeaway: /health is often the fastest agent inventory on the board.
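The enumeration loop from this section could look roughly like the sketch below, using only the standard library. The function names and the rule of dropping 404s are assumptions for illustration; SpyAI's own implementation may differ. The `fetch` parameter is injectable so the loop can be exercised without touching a network.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, Optional


def get_status(url: str, timeout: float = 3.0) -> Optional[int]:
    """Issue one GET and return the HTTP status, or None if unreachable.

    urlopen follows redirects, so the status is that of the final response.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses still show that the server handled the path.
        return exc.code
    except (urllib.error.URLError, OSError):
        return None


def enumerate_paths(
    base_url: str,
    paths: Iterable[str],
    fetch: Callable[[str], Optional[int]] = get_status,
) -> Dict[str, int]:
    """Probe each path once and keep everything that is not a 404 or a failure."""
    hits: Dict[str, int] = {}
    for path in paths:
        status = fetch(base_url.rstrip("/") + path)
        if status is not None and status != 404:
            hits[path] = status
    return hits
```

A 405 is kept on purpose: as with `/chat` above, it confirms the route exists and only the method is wrong.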
4. HTML/JS Mining
Documented endpoints live in OpenAPI. Undocumented ones hide in frontend JavaScript.
SpyAI parses / for fetch() calls, axios usage, data-endpoint attributes, and common config keys (apiUrl, endpoint, baseUrl).
Extracted from port 8001:
// Frontend JS that reveals the real /chat contract: POST + JSON in, session_id out.
const r = await fetch('/chat', {
  method: 'POST',
  headers: {'Content-Type': 'application/json'},
  body: JSON.stringify(body)
});
const d = await r.json();
sid = d.session_id;

Three findings, zero OpenAPI required:
- `/chat` is POST-only with a JSON body (`message`, optional `session_id`).
- Responses carry `session_id` and `response`.
- Session state is client-managed, which matters for fixation and context manipulation later.
The bare GET on /chat returned 405. Easy to miss without JS mining.
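A minimal sketch of the `fetch()` extraction, assuming a regex-based approach: the patterns, the 200-character look-ahead window, and the function name are all illustrative choices, and a regex will miss dynamically built URLs that a real JavaScript parser would catch.

```python
import re
from typing import List, Tuple

# Matches fetch('<url>' or fetch("<url>" and captures the literal URL.
FETCH_RE = re.compile(r"""fetch\(\s*['"]([^'"]+)['"]""")
# Matches method: 'POST' style options inside the call.
METHOD_RE = re.compile(r"""method\s*:\s*['"]([A-Za-z]+)['"]""")


def mine_fetch_calls(source: str, window: int = 200) -> List[Tuple[str, str]]:
    """Return (METHOD, url) pairs for every literal fetch() call in source."""
    found: List[Tuple[str, str]] = []
    for match in FETCH_RE.finditer(source):
        tail = source[match.end(): match.end() + window]
        # Cut the window at the next fetch( so its method is not misattributed.
        nxt = tail.find("fetch(")
        if nxt != -1:
            tail = tail[:nxt]
        method = METHOD_RE.search(tail)
        # fetch() defaults to GET when no method option is given.
        found.append((method.group(1).upper() if method else "GET", match.group(1)))
    return found
```

Run against the snippet extracted from port 8001, this yields a single `("POST", "/chat")` pair, which is the contract a bare GET probe cannot see.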
Takeaway: The chat UI is a recon source, not just an attack surface.
5. OpenAPI Parsing
When /openapi.json returns 200, SpyAI extracts every path and HTTP method.
Port 8001, 16 paths:
# Every route and HTTP method parsed straight from the exposed /openapi.json.
GET   /                     Chat Ui
POST  /chat                 Chat
POST  /browse               Browse
POST  /upload               Upload
POST  /summarize            Summarize Docs
POST  /kb/add               Kb Add
GET   /kb/search            Kb Search Endpoint
GET   /kb/topics            Kb Topics
GET   /debug/db-schema      Debug Schema
GET   /debug/query          Debug Query
GET   /logs/last-tool-call  Last Tool Call
GET   /logs/latest          Logs Latest
POST  /session/new          New Session
POST  /reset                Reset
POST  /review               Review
GET   /health               Health

Offensive priorities jump out immediately: the /debug/* routes, the /logs/* endpoints, and the RAG ingestion paths (/kb/add, /upload).
The same 16-path schema appeared on all five agent ports. One codebase, five agent personas, one shared attack surface.
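Extracting routes from an OpenAPI document is mostly a walk over its `paths` object. The sketch below assumes the spec has already been fetched and decoded into a dict. The function names and the `SENSITIVE_PREFIXES` tuple are hypothetical, chosen to match the routes highlighted in this article.

```python
from typing import Any, Dict, List, Tuple

# Keys of an OpenAPI Path Item Object that denote HTTP operations.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}
# Prefixes treated as high-interest here; an illustrative assumption.
SENSITIVE_PREFIXES = ("/debug", "/logs", "/kb", "/upload")

Route = Tuple[str, str, str]  # (METHOD, path, summary)


def list_routes(spec: Dict[str, Any]) -> List[Route]:
    """Return every (METHOD, path, summary) triple, sorted by path then method."""
    routes: List[Route] = []
    for path, item in spec.get("paths", {}).items():
        for key, operation in item.items():
            # Path items also hold non-operation keys such as "parameters".
            if key.lower() in HTTP_METHODS and isinstance(operation, dict):
                routes.append((key.upper(), path, operation.get("summary", "")))
    return sorted(routes, key=lambda r: (r[1], r[0]))


def flag_sensitive(routes: List[Route]) -> List[Route]:
    """Keep only routes whose path starts with a high-interest prefix."""
    return [r for r in routes if r[1].startswith(SENSITIVE_PREFIXES)]
```

Comparing the output of `list_routes` across ports is one simple way to confirm that several agents share a schema.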
Takeaway: OpenAPI is the blueprint. Treat exposed Swagger like exposed admin panels.
6. Agent Health Detection
SpyAI flags JSON /health responses containing an agent field — a common pattern in uvicorn-based LLM deployments.
Full agent inventory from one scan:
| Port | Agent                          |
| ---- | ------------------------------ |
| 8001 | IT Helpdesk Assistant          |
| 8002 | Secure IT Assistant            |
| 8003 | Knowledge Base Assistant       |
| 8011 | Secure Engineering Portal      |
| 8012 | Department Resources Assistant |

Five personas, one host. Names imply different trust boundaries and data scopes: an IT helpdesk agent likely sees different context than a "Secure Engineering Portal." That mapping drives targeted prompt injection and poisoned-document placement in the next phase.
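The detection itself can be as small as a guarded JSON parse. A minimal sketch follows; the function name is an assumption, and the only thing taken from the scan is the shape of the `/health` body shown earlier.

```python
import json
from typing import Optional


def agent_from_health(body: str) -> Optional[str]:
    """Return the 'agent' value from a /health JSON body, or None."""
    try:
        data = json.loads(body)
    except (json.JSONDecodeError, TypeError):
        # Not JSON (for example an HTML error page) or not a string at all.
        return None
    if isinstance(data, dict) and isinstance(data.get("agent"), str):
        return data["agent"]
    return None


# The body returned by port 8001 in this article.
print(agent_from_health(
    '{"status":"healthy","agent":"IT Helpdesk Assistant","port":8001}'
))
```

Returning None instead of raising keeps a scan going when one port answers `/health` with something unexpected.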
Takeaway: Agent names are scope indicators, not labels.
7. TLS Certificate Mining
On HTTPS ports, SpyAI runs openssl s_client to extract subject, issuer, validity, and SANs.
This lab target had no HTTPS in the run — all agents served plain HTTP. When TLS is present, SANs often reveal internal hostnames, staging domains, and related infrastructure invisible to DNS alone. SpyAI runs this step automatically on 443, 8443, and other HTTPS-likely ports.
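For readers who prefer to stay in Python, the SAN extraction can be approximated with the standard `ssl` module instead of `openssl s_client`. This is a swapped-in technique, not what SpyAI does, and it has a real limitation: `getpeercert()` only returns a populated dict when the certificate validates, so self-signed lab certificates would fail the handshake here. Function names are hypothetical.

```python
import socket
import ssl
from typing import Any, Dict, List


def sans_from_cert(cert: Dict[str, Any]) -> List[str]:
    """Collect DNS and IP entries from a getpeercert()-style dict."""
    return [
        value
        for kind, value in cert.get("subjectAltName", ())
        if kind in ("DNS", "IP Address")
    ]


def fetch_cert(host: str, port: int = 443, timeout: float = 5.0) -> Dict[str, Any]:
    """Fetch the peer certificate as a dict over a validated TLS connection.

    Raises ssl.SSLCertVerificationError for untrusted or self-signed certs.
    """
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

Typical use would be `sans_from_cert(fetch_cert(host, 8443))` against an authorized target. The parsing half is kept separate so it can also be fed certificates decoded by other means.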
Takeaway: Certs are passive OSINT sitting on every HTTPS port.
The Full Picture (66 Seconds)
After all seven techniques, here is the complete surface map for port 8001:
| Path            | Status | What it signals                                   |
| --------------- | ------ | ------------------------------------------------- |
| `/`             | 200    | Chat UI, a JS mining target                       |
| `/docs`         | 200    | Interactive Swagger, an attack surface on its own |
| `/redoc`        | 200    | API documentation                                 |
| `/openapi.json` | 200    | Full schema, 16 paths                             |
| `/health`       | 200    | Agent identity                                    |
| `/chat`         | 405    | POST-only API, confirmed via JS                   |

Before any active testing, a pentester knows:
- Stack: FastAPI + uvicorn (Python).
- Architecture: Multi-agent, shared codebase, PostgreSQL backend.
- RAG surface: `/kb/add`, `/kb/search`, `/upload`, `/summarize`.
- Debug exposure: `/debug/db-schema`, `/debug/query`.
- Agent tooling: `/logs/last-tool-call` (potential MCP/RAG trace leakage).
- Cost: 106 TCP connects and under 200 HTTP requests (1 HEAD + 34 GET per HTTP port, plus health/embedded probes). No credentials. No exploits.
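The request budget in the last bullet is easy to check. The sketch below covers only the baseline of one HEAD plus 34 GETs on each of the five HTTP ports. The number of extra health and embedded-endpoint probes is not stated in this article, so it is left out rather than guessed.

```python
TCP_PORTS_PROBED = 106
HTTP_PORTS_OPEN = 5    # 8001, 8002, 8003, 8011, 8012
HEAD_PER_PORT = 1
GET_PER_PORT = 34


def baseline_http_requests(ports: int, head: int, get: int) -> int:
    """HTTP requests sent before any follow-up probes."""
    return ports * (head + get)


baseline = baseline_http_requests(HTTP_PORTS_OPEN, HEAD_PER_PORT, GET_PER_PORT)
print(baseline)        # 175
print(200 - baseline)  # 25: follow-up probes must stay below this to remain under 200
```

So the stated ceiling of "under 200" is consistent with the baseline as long as the scan adds fewer than 25 follow-up requests.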
That is the recon ROI: a precise map in under a minute.
Framework Mapping
Passive recon does not exploit findings — it maps them. That map is what makes the next phase surgical instead of noisy.
| MITRE ATLAS technique             | What SpyAI surfaces                     |
| --------------------------------- | --------------------------------------- |
| Active Scanning                   | TCP connect on AI-likely port ranges    |
| Discover AI Artifacts             | OpenAPI, Swagger, agent `/health`       |
| Exfiltration via AI Inference API | `/kb/search`, `/debug/query`, `/logs/*` |

- AML.T0006 Active Scanning
- AML.T0007 Discover AI Artifacts
- AML.T0024 Exfiltration via AI Inference API
| OWASP LLM risk                   | Recon finding                                                        |
| -------------------------------- | -------------------------------------------------------------------- |
| Sensitive Information Disclosure | `/debug/db-schema`, `/debug/query`, `/logs/latest` without auth      |
| Excessive Agency                 | `/logs/last-tool-call`; agent names imply different privilege scopes |
| Vector and Embedding Weaknesses  | `/kb/add`, `/kb/search`, `/upload` map RAG ingestion and retrieval   |

Scope Boundaries
SpyAI is the first pass — intentionally narrow:
- No version detection (`-sV`). Run nmap separately for exact versions.
- No fuzzing. The wordlist is curated, not exhaustive.
- No authentication or exploit delivery.
- Dependencies: `python3`, `curl`, optional `openssl`. That is it.
What Comes Next
Passive recon answers: what is deployed, where, and what does it expose?
Active recon answers: how does it behave under pressure?
The transition point is this map. You now know which agent to target, which debug endpoint to probe, which RAG path to poison — before the first prompt lands.
If you are building AI red team capability, start here. Map the stack. Then prompt.
This article is Chapter 1 of an AI recon series — passive mapping before any model interaction. Chapter 2 covers active recon (contradiction testing, knowledge cutoff, citation forcing). The goal is a complete playbook built from real lab work.
Try It Yourself
# Clone, make the scripts executable, then point SpyAI at your own target.
git clone https://github.com/jmessiass/spy-ai.git
cd spy-ai
chmod +x spy-ai lib/scan_ports.py
./spy-ai <your-authorized-target>

Reports land in output/<target>-<N>.md with a full execution log. Wordlists in data/ are community-extensible: open an issue or PR if you find paths worth adding.
That's all folks!
Repository: https://github.com/jmessiass/spy-ai
References
- MITRE ATLAS — Adversarial Threat Landscape for Artificial-Intelligence Systems
- MITRE ATLAS — Reconnaissance tactic (AML.TA0002)
- AML.T0006 Active Scanning
- AML.T0007 Discover AI Artifacts
- AML.T0024 Exfiltration via AI Inference API
- OWASP Top 10 for LLM Applications (2025) — OWASP GenAI Security Project
- LLM02:2025 Sensitive Information Disclosure
- LLM06:2025 Excessive Agency
- LLM08:2025 Vector and Embedding Weaknesses
- OpenAPI Specification
Legal: Run SpyAI only on systems you are authorized to test. Unauthorized scanning may violate applicable law.