Spy Before You Prompt: Passive Recon for LLM Agents, RAG and the AI Stack

How seven read-only techniques mapped five LLM agents, exposed Swagger, and flagged debug endpoints — in 66 seconds, without sending a single chat message.

Most offensive testers still recon AI systems like traditional web apps: port 443, a login page, maybe a /api prefix. That misses the point.

An AI-facing target is not one application. It is an ecosystem — orchestration layers, RAG pipelines, vector stores, inference APIs, and often multiple agents sharing the same codebase on high, non-standard ports. The footprints are different. The recon should be too.

This article walks through that recon: seven passive techniques, one lab target, one open-source tool (SpyAI), and a full attack-surface map produced before anyone typed a prompt.

The Thesis

You can learn more about an AI system from its infrastructure than from its chat window.

Swagger docs expose debug endpoints. /health JSON reveals agent names. OpenAPI schemas list RAG ingestion paths. JavaScript on the chat UI confirms POST-only APIs that return 405 on a naive GET. All of this is observable without credentials, without fuzzing, and without talking to the model.

That is passive recon for AI — and it is the difference between spraying generic prompt injections and targeting the right agent, endpoint, and trust boundary.

Passive ≠ Silent

SpyAI does not send credentials, fuzz aggressively, or deliver exploit payloads. Every probe is read-only: TCP connect checks, HEAD requests, and GET calls against a curated wordlist.

But passive here does not mean invisible. SpyAI generates network traffic. Run it only on systems you own or have explicit permission to test. Unauthorized scanning may violate law or policy.

Mental model: read-only intent, active network presence.

Lab Setup & Disclaimer

All examples in this article are based on scans conducted within my personal Offensive Security lab environment. The IP addresses mentioned, such as 192.168.217.21, are private addresses (RFC1918) used solely within this isolated lab setup and are not accessible from the public internet. This article does not disclose any sensitive or proprietary information related to Offensive Security labs.

One command:

# Run the full passive recon pipeline against one authorized target.
./spy-ai 192.168.217.21

~66 seconds later, SpyAI produces a Markdown report and a full execution log. Here is what came back:

| Port | Service    |
| ---- | ---------- |
| 22   | ssh        |
| 5432 | postgresql |
| 8001 | http       |
| 8002 | http       |
| 8003 | http       |
| 8011 | http       |
| 8012 | http       |

Seven ports. Five in the AI-likely range. One PostgreSQL backend. Zero chat messages sent.

That pattern — multiple high ports + a relational database — already suggests a multi-agent deployment with persistent storage, likely RAG or session state. The architecture is taking shape before technique one even runs.

The Seven Techniques

1. Parallel TCP Connect on AI-Likely Ports

Generic scanners stop at 22, 80, 443. AI stacks live elsewhere: inference APIs on 8000–8080, Ollama on 11434, Elasticsearch on 9200, Kibana on 5601, MinIO on 9000/9001.

SpyAI parallelizes TCP connect probes across 106 ports using a stdlib-only Python scanner — no nmap dependency:

import concurrent.futures
import socket
from typing import List, Sequence

# Returns True if a TCP handshake to host:port succeeds within the timeout.
def try_port(host: str, port: int, timeout: float) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # also covers socket.timeout, which is an OSError subclass
        return False

# Probes every port in parallel and returns the sorted list of open ones.
def scan(host: str, ports: Sequence[int], timeout: float, workers: int) -> List[int]:
    open_ports: List[int] = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each future back to its port so results can be collected as they finish.
        futures = {pool.submit(try_port, host, p, timeout): p for p in ports}
        for fut in concurrent.futures.as_completed(futures):
            port = futures[fut]
            if fut.result():
                open_ports.append(port)
    return sorted(open_ports)

Ports in 8000–8080 are labeled ai-likely. Known services get explicit hints: 11434 → Ollama, 9200 → Elasticsearch, 5601 → Kibana.
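The labeling rule can be sketched in a few lines. This is a minimal illustration, not SpyAI's actual code: the names `KNOWN_SERVICES`, `label_port`, and `label_ports` are hypothetical, and only the 8000–8080 range and the three service hints come from the text above.

```python
from typing import Dict, List

# Explicit hints named in the article; the dict itself is a hypothetical structure.
KNOWN_SERVICES: Dict[int, str] = {
    11434: "Ollama",
    9200: "Elasticsearch",
    5601: "Kibana",
}


def label_port(port: int) -> str:
    """Return a known-service hint, 'ai-likely', or 'other' for one port."""
    if port in KNOWN_SERVICES:
        return KNOWN_SERVICES[port]
    if 8000 <= port <= 8080:
        return "ai-likely"
    return "other"


def label_ports(ports: List[int]) -> Dict[int, str]:
    """Label every port in a scan result."""
    return {port: label_port(port) for port in ports}
```

Feeding it the seven open ports from the lab scan would tag 8001, 8002, 8003, 8011, and 8012 as ai-likely and leave 22 and 5432 as other.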

Live output:

# Scan 106 common + AI-likely ports; 7 came back open.
$ python3 lib/scan_ports.py 192.168.217.21 -p 22,25,53,80,...,11434,27017
scanning 106 port(s) on 192.168.217.21 ...
open: 7
22 5432 8001 8002 8003 8011 8012

Takeaway: Five AI-likely ports clustered on one host is not a single chatbot. It is a fleet.

2. HTTP Banner Fingerprinting (`HEAD /`)

Once HTTP-likely ports are open, SpyAI sends HEAD / — headers only, no body.

# HEAD request for headers only; the response fingerprints the framework.
$ curl -sS -k -I http://192.168.217.21:8001/
HTTP/1.1 405 Method Not Allowed
server: uvicorn
allow: GET
content-type: application/json

405 + server: uvicorn + allow: GET is a signature typical of FastAPI on uvicorn: the root serves a chat UI via GET and has no HEAD handler. The stack is fingerprinted before any authentication.

All five agent ports returned the same signature. Same stack, multiple instances.

SpyAI also captures custom headers when present — X-AI-Backend, X-RAG-Provider, X-Model, X-Inference-Engine — which can fingerprint the inference provider or RAG backend without touching application logic.
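One way to turn a HEAD response into findings is a small pure function over the status code and headers. This is a sketch under assumptions: `fingerprint` and `AI_HEADERS` are hypothetical names and the heuristics are illustrative. Only the header names and the 405/uvicorn signature come from the text above.

```python
from typing import Dict, List, Mapping

# Custom headers the article says SpyAI captures when present (lowercased for lookup).
AI_HEADERS = ("x-ai-backend", "x-rag-provider", "x-model", "x-inference-engine")


def fingerprint(status: int, headers: Mapping[str, str]) -> Dict[str, object]:
    """Summarize what a HEAD / response reveals about the stack behind a port."""
    h = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    notes: List[str] = []
    server = h.get("server", "")
    if "uvicorn" in server.lower():
        notes.append("ASGI server: uvicorn")
    if status == 405 and "allow" in h:
        notes.append("HEAD rejected; allowed methods: " + h["allow"])
    ai_hints = {name: h[name] for name in AI_HEADERS if name in h}
    return {"server": server, "notes": notes, "ai_hints": ai_hints}
```

Keeping the function free of network calls makes it easy to replay captured headers from the execution log.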

Takeaway: A 405 is information, not a dead end.

3. Curated Path Enumeration

Port scan tells you where. Path enumeration tells you what.

SpyAI probes 34 paths per open HTTP port from a wordlist grouped by ecosystem:

|       Category      |                        Paths                  |
| --------------------|-----------------------------------------------|
| FastAPI / Swagger   | `/openapi.json`, `/docs`, `/redoc`, `/health` |
| OpenAI-compatible   | `/v1/models`, `/v1/chat/completions`          |
| Object storage      | `/minio/health/live`, `/minio/health/ready`   |
| K8s / observability | `/-/health`, `/-/metrics`, `/metrics`         |

Hits on port 8001:

|       Path      | Status | Notes                         |
| --------------- | ------ | ----------------------------- |
| `/`             | 200    | title="IT Helpdesk Assistant" |
| `/docs`         | 200    | FastAPI Swagger UI            |
| `/redoc`        | 200    | ReDoc                         |
| `/openapi.json` | 200    | 4,906 bytes                   |
| `/health`       | 200    | agent identity (below)        |
| `/chat`         | 405    | `allow: POST`                 |

# Unauthenticated GET on /health leaks the agent's name and bound port.
$ curl http://192.168.217.21:8001/health
{"status":"healthy","agent":"IT Helpdesk Assistant","port":8001}

One unauthenticated request: agent name, health status, bound port.
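The enumeration loop itself is simple. Here is a minimal sketch using only the standard library, assuming a plain GET per path. The names `http_get_status` and `probe_paths` are hypothetical, and SpyAI itself drives curl rather than urllib.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def http_get_status(url: str, timeout: float = 3.0) -> int:
    """Issue one GET and return the status code (0 if the host is unreachable)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses still carry information (e.g. 405)
    except (urllib.error.URLError, OSError):
        return 0


def probe_paths(
    base_url: str,
    paths: Iterable[str],
    fetch: Callable[[str], int] = http_get_status,
) -> Dict[str, int]:
    """Return {path: status} for every path that is neither 404 nor unreachable."""
    hits: Dict[str, int] = {}
    for path in paths:
        status = fetch(base_url.rstrip("/") + path)
        if status not in (0, 404):
            hits[path] = status
    return hits
```

The `fetch` parameter is injectable so the loop can be exercised offline. Note that a 405 counts as a hit here, which is what surfaces /chat.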

Takeaway: /health is often the fastest agent inventory on the board.

4. HTML/JS Mining

Documented endpoints live in OpenAPI. Undocumented ones hide in frontend JavaScript.

SpyAI parses / for fetch() calls, axios usage, data-endpoint attributes, and common config keys (apiUrl, endpoint, baseUrl).

Extracted from port 8001:

// Frontend JS that reveals the real /chat contract: POST + JSON in, session_id out.
const r = await fetch('/chat', {
    method: 'POST',
    headers: {'Content-Type': 'application/json'},
    body: JSON.stringify(body)
});
const d = await r.json();
sid = d.session_id;

Three findings, zero OpenAPI required:

/chat is POST-only with a JSON body (message, optional session_id).
Responses carry session_id and response.
Session state is client-managed — relevant for fixation and context manipulation later.

The bare GET on /chat returned 405. Easy to miss without JS mining.
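A regex pass over the page source is enough to recover calls like the one above. This is a rough sketch rather than SpyAI's parser: `FETCH_RE` and `extract_fetch_calls` are hypothetical names, and the pattern assumes the `method` key appears before any nested object in the options literal.

```python
import re
from typing import List, Tuple

# Matches fetch('<path>' and, optionally, method: '<VERB>' in the options object.
FETCH_RE = re.compile(
    r"""fetch\(\s*['"](?P<path>[^'"]+)['"]"""
    r"""(?:\s*,\s*\{[^}]*?method\s*:\s*['"](?P<method>[A-Za-z]+)['"])?"""
)


def extract_fetch_calls(source: str) -> List[Tuple[str, str]]:
    """Return (METHOD, path) pairs for each fetch() call; method defaults to GET."""
    calls: List[Tuple[str, str]] = []
    for match in FETCH_RE.finditer(source):
        method = (match.group("method") or "GET").upper()
        calls.append((method, match.group("path")))
    return calls
```

A real JavaScript parser would be more robust, but for inline chat-UI scripts a pattern like this tends to be sufficient for a first pass.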

Takeaway: The chat UI is a recon source, not just an attack surface.

5. OpenAPI Parsing

When /openapi.json returns 200, SpyAI extracts every path and HTTP method.

Port 8001, 16 paths:

# Every route and HTTP method parsed straight from the exposed /openapi.json.
GET    /                                   Chat Ui
POST   /chat                               Chat
POST   /browse                             Browse
POST   /upload                             Upload
POST   /summarize                          Summarize Docs
POST   /kb/add                             Kb Add
GET    /kb/search                          Kb Search Endpoint
GET    /kb/topics                          Kb Topics
GET    /debug/db-schema                    Debug Schema
GET    /debug/query                        Debug Query
GET    /logs/last-tool-call                Last Tool Call
GET    /logs/latest                        Logs Latest
POST   /session/new                        New Session
POST   /reset                              Reset
POST   /review                             Review
GET    /health                             Health

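Offensive priorities jump out immediately: the /debug/db-schema and /debug/query routes, the /logs/* endpoints, and the RAG write paths /kb/add and /upload.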

The same 16-path schema appeared on all five agent ports. One codebase, five agent personas, one shared attack surface.
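Extracting that route list from a parsed schema takes only a few lines. The sketch below assumes the standard OpenAPI layout, where `paths` maps each path to an object keyed by HTTP method. The names `HTTP_METHODS` and `list_operations` are illustrative, not SpyAI's.

```python
from typing import Any, Dict, List, Tuple

# Operation keys defined by the OpenAPI Path Item Object.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(spec: Dict[str, Any]) -> List[Tuple[str, str, str]]:
    """Return (METHOD, path, summary) for every operation in an OpenAPI document."""
    ops: List[Tuple[str, str, str]] = []
    for path, item in spec.get("paths", {}).items():
        for key, operation in item.items():
            if key.lower() in HTTP_METHODS:  # skip non-operation keys such as "parameters"
                ops.append((key.upper(), path, operation.get("summary", "")))
    return sorted(ops, key=lambda op: (op[1], op[0]))
```

Running it against each port's schema and comparing the results is one way to verify that the agents share a codebase.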

Takeaway: OpenAPI is the blueprint. Treat exposed Swagger like exposed admin panels.

6. Agent Health Detection

SpyAI flags JSON /health responses containing an agent field — a common pattern in uvicorn-based LLM deployments.

Full agent inventory from one scan:

| Port | Agent                          |
| ---- | ------------------------------ |
| 8001 | IT Helpdesk Assistant          |
| 8002 | Secure IT Assistant            |
| 8003 | Knowledge Base Assistant       |
| 8011 | Secure Engineering Portal      |
| 8012 | Department Resources Assistant |

Five personas, one host. Names imply different trust boundaries and data scopes — an IT helpdesk agent likely sees different context than a "Secure Engineering Portal." That mapping drives targeted prompt injection and poisoned-document placement in the next phase.
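The detection rule behind that inventory can be sketched as a single check on the /health body. `agent_from_health` is a hypothetical name, and requiring the agent field to be a string is an assumption on my part.

```python
import json
from typing import Optional


def agent_from_health(body: str) -> Optional[str]:
    """Return the agent name if body is a JSON object with a string 'agent' field."""
    try:
        data = json.loads(body)
    except (json.JSONDecodeError, TypeError):
        return None  # plain-text or empty health checks carry no agent identity
    if isinstance(data, dict) and isinstance(data.get("agent"), str):
        return data["agent"]
    return None
```

Applied to the port 8001 response shown earlier, it returns "IT Helpdesk Assistant". A bare "OK" body returns None.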

Takeaway: Agent names are scope indicators, not labels.

7. TLS Certificate Mining

On HTTPS ports, SpyAI runs openssl s_client to extract subject, issuer, validity, and SANs.

This lab target had no HTTPS in the run — all agents served plain HTTP. When TLS is present, SANs often reveal internal hostnames, staging domains, and related infrastructure invisible to DNS alone. SpyAI runs this step automatically on 443, 8443, and other HTTPS-likely ports.
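Since the lab produced no certificate, here is a hedged sketch of the SAN step on its own. It assumes the certificate has already been dumped as text, for example by piping `openssl s_client` output into `openssl x509 -noout -text`. The function name `extract_sans` and the sample hostnames are hypothetical.

```python
from typing import List


def extract_sans(x509_text: str) -> List[str]:
    """Pull SAN entries from the text form printed by `openssl x509 -noout -text`."""
    lines = x509_text.splitlines()
    for i, line in enumerate(lines):
        # The SAN values sit on the line right after the extension header.
        if "X509v3 Subject Alternative Name" in line and i + 1 < len(lines):
            entries = [entry.strip() for entry in lines[i + 1].split(",")]
            return [entry for entry in entries if entry]
    return []


# Hypothetical excerpt; these hostnames are placeholders, not lab findings.
SAMPLE = (
    "        X509v3 extensions:\n"
    "            X509v3 Subject Alternative Name: \n"
    "                DNS:agents.lab.example, DNS:staging.lab.example, IP Address:10.0.0.5\n"
)
```

Each returned entry keeps its type prefix (DNS: or IP Address:), so hostnames and addresses can be split into separate follow-up lists.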

Takeaway: Certs are passive OSINT sitting on every HTTPS port.

The Full Picture (66 Seconds)

After all seven techniques, here is the complete surface map for port 8001:

| Path            | Status | What it signals                                 |
| --------------- | ------ | ----------------------------------------------- |
| `/`             | 200    | Chat UI — JS mining target                      |
| `/docs`         | 200    | Interactive Swagger — attack surface on its own |
| `/redoc`        | 200    | API documentation                               |
| `/openapi.json` | 200    | Full schema — 16 paths                          |
| `/health`       | 200    | Agent identity                                  |
| `/chat`         | 405    | POST-only API — confirmed via JS                |

Before any active testing, a pentester knows:

Stack: FastAPI + uvicorn (Python).
Architecture: Multi-agent, shared codebase, PostgreSQL backend.
RAG surface: /kb/add, /kb/search, /upload, /summarize.
Debug exposure: /debug/db-schema, /debug/query.
Agent tooling: /logs/last-tool-call — potential MCP/RAG trace leakage.
Cost: 106 TCP connects and under 200 HTTP requests (1 HEAD + 34 GET per HTTP port, plus health/embedded probes). No credentials. No exploits.
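The request budget is easy to check by hand. The snippet below computes only the baseline from the figures stated above. The article does not give the exact number of extra health/embedded probes, so they are left out.

```python
TCP_PORTS = 106      # ports probed by the connect scan
HTTP_PORTS = 5       # 8001, 8002, 8003, 8011, 8012
HEAD_PER_PORT = 1    # banner fingerprint
GET_PER_PORT = 34    # curated wordlist size

# Baseline HTTP requests before any follow-up probes.
baseline_http = HTTP_PORTS * (HEAD_PER_PORT + GET_PER_PORT)
print(baseline_http)  # 175
```

That leaves room for up to 24 follow-up requests before the total reaches 200.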

That is the recon ROI: a precise map in under a minute.

Framework Mapping

Passive recon does not exploit findings — it maps them. That map is what makes the next phase surgical instead of noisy.

MITRE ATLAS

|             Technique             |         What SpyAI surfaces          |
| --------------------------------- | ------------------------------------ |
| Active Scanning                   | TCP connect on AI-likely port ranges |
| Discover AI Artifacts             | OpenAPI, Swagger, agent `/health`    |
| Exfiltration via AI Inference API | /kb/search, /debug/query, /logs/*    |

OWASP LLM Top 10 (2025)

| Risk                             | Recon evidence                                                       |
| -------------------------------- | -------------------------------------------------------------------- |
| Sensitive Information Disclosure | `/debug/db-schema`, `/debug/query`, `/logs/latest` without auth      |
| Excessive Agency                 | `/logs/last-tool-call`; agent names imply different privilege scopes |
| Vector and Embedding Weaknesses  | `/kb/add`, `/kb/search`, `/upload` map RAG ingestion and retrieval   |
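
The mapping in this table can be carried into tooling as a plain lookup, so every discovered route arrives in the report already tagged. This is a sketch only: `RISK_BY_PATH` and `tag_paths` are hypothetical names, and the path-to-risk pairs are exactly the ones listed in the table.

```python
from typing import Dict, List

SENSITIVE = "LLM02 Sensitive Information Disclosure"
AGENCY = "LLM06 Excessive Agency"
VECTOR = "LLM08 Vector and Embedding Weaknesses"

# Path-to-risk pairs copied from the table; the dict layout is illustrative.
RISK_BY_PATH: Dict[str, str] = {
    "/debug/db-schema": SENSITIVE,
    "/debug/query": SENSITIVE,
    "/logs/latest": SENSITIVE,
    "/logs/last-tool-call": AGENCY,
    "/kb/add": VECTOR,
    "/kb/search": VECTOR,
    "/upload": VECTOR,
}


def tag_paths(paths: List[str]) -> Dict[str, str]:
    """Attach an OWASP LLM Top 10 label to each path that has a known mapping."""
    return {path: RISK_BY_PATH[path] for path in paths if path in RISK_BY_PATH}
```

Paths without an entry, such as /chat or /health, are simply left untagged rather than guessed at.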

Scope Boundaries

SpyAI is the first pass — intentionally narrow:

No version detection (-sV). Run nmap separately for exact versions.
No fuzzing. The wordlist is curated, not exhaustive.
No authentication or exploit delivery.
Dependencies: python3, curl, optional openssl. That is it.

What Comes Next

Passive recon answers: what is deployed, where, and what does it expose?

Active recon answers: how does it behave under pressure?

The transition point is this map. You now know which agent to target, which debug endpoint to probe, which RAG path to poison — before the first prompt lands.

If you are building AI red team capability, start here. Map the stack. Then prompt.

This article is Chapter 1 of an AI recon series — passive mapping before any model interaction. Chapter 2 covers active recon (contradiction testing, knowledge cutoff, citation forcing). The goal is a complete playbook built from real lab work.

Try It Yourself

# Clone, make the scripts executable, then point SpyAI at your own target.
git clone https://github.com/jmessiass/spy-ai.git
cd spy-ai
chmod +x spy-ai lib/scan_ports.py
./spy-ai <your-authorized-target>

Reports land in output/<target>-<N>.md with a full execution log. Wordlists in data/ are community-extensible — open an issue or PR if you find paths worth adding.

That's all folks!

Repository: https://github.com/jmessiass/spy-ai

References

MITRE ATLAS — Adversarial Threat Landscape for Artificial-Intelligence Systems
MITRE ATLAS — Reconnaissance tactic (AML.TA0002)
AML.T0006 Active Scanning
AML.T0007 Discover AI Artifacts
AML.T0024 Exfiltration via AI Inference API
OWASP Top 10 for LLM Applications (2025) — OWASP GenAI Security Project
LLM02:2025 Sensitive Information Disclosure
LLM06:2025 Excessive Agency
LLM08:2025 Vector and Embedding Weaknesses
OpenAPI Specification

Tags: AI Security, LLM Security, Red Team, Reconnaissance, OffSec, Passive Recon, FastAPI, RAG, MITRE ATLAS, OWASP LLM

Lab disclaimer: All examples are based on scans within my personal Offensive Security lab environment. IP addresses such as 192.168.217.21 are private (RFC1918), used solely in this isolated setup, and are not accessible from the public internet. This article does not disclose sensitive or proprietary information related to Offensive Security labs.

Legal: Run SpyAI only on systems you are authorized to test. Unauthorized scanning may violate applicable law.
