Most security researchers and bug bounty hunters hit the same wall after recon: the data. When you're working against a large target and your spider returns 50,000 or 100,000 URLs, the instinct is to start scanning immediately. That instinct is wrong, and it's where most hunters lose the edge that separates a $200 duplicate from a $5,000 logic flaw.
This article lays out a structured pathway from raw recon output to purposeful, hypothesis-driven security testing: the methodology I personally use when working large targets.
The Real Problem Isn't the Data Volume
Before we talk about filtering and tooling, understand what the actual problem is.
The problem isn't that you have too many URLs. The problem is that URLs without context are meaningless. A list of 100,000 paths tells you nothing about what the application does, what assumptions the developers made, or where those assumptions break. Hunters who jump from URL list to Burp Suite are essentially scanning blindly. They might find something, but they won't find the business logic flaws that pay well.
The mental model shift you need to make is this: your recon output is raw material for building an understanding of the application, not a queue of targets to attack.
Step 1: Understand the Target in One Sentence
Before touching your URL list, write one sentence describing what the target does. Not a paragraph, one sentence.
UberEats is a mobile platform that connects customers with local restaurants and couriers to order, track, and receive food and grocery deliveries.
This sounds trivial. It isn't. That single sentence anchors every decision you make downstream. It tells you who the stakeholders are (customers, restaurants, couriers), what the core value exchange is (money for food), and what features must therefore exist (ordering, payment, delivery tracking, ratings). You now have a mental skeleton before you've read a single URL.
Step 2: Filter Your URL List Before Doing Anything Else
Raw spider output is polluted with noise. Static assets (images, fonts, stylesheets, sourcemaps) make up the bulk of most URL lists and contribute nothing to security testing. Before you do any analysis, strip them out.
Pass 1 — Remove static assets (keep JS separate):
# Strip non-JS static assets
grep -Eiv '\.(css|png|jpg|jpeg|gif|svg|ico|woff|woff2|ttf|eot|otf|mp4|webp|pdf|zip|map)(\?.*)?$' \
allurls.txt > clean1.txt
# Separate JS files into their own pile
grep -Ei '\.js(\?.*)?$' clean1.txt > jsfiles.txt
grep -Eiv '\.js(\?.*)?$' clean1.txt > dynamicurls.txt
Do not throw away your JS files. They go into a separate pile for analysis. Bundled JavaScript regularly exposes API endpoints, hardcoded tokens, and feature flag logic that never appears in spidered URLs.
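As a sketch of that first pass over the JS pile, the following pulls endpoint-looking string literals out of downloaded bundles. The ./js/ directory, js_endpoints.txt, and the bundle contents are illustrative names, and the printf fabricates a tiny bundle so the commands run standalone; in practice you would fetch the real bundles first (e.g. wget -i jsfiles.txt -P js/):

```shell
# Pull endpoint-looking strings out of downloaded JS bundles.
# The printf below fabricates a tiny sample bundle for illustration.
mkdir -p js
printf 'var a="/api/v1/users/profile";fetch("/v2/orders/checkout");\n' > js/app.bundle.js

# Extract quoted path literals, strip the quotes, keep API-shaped paths.
grep -hoE '"/[A-Za-z0-9_./-]+"' js/*.js \
  | tr -d '"' \
  | grep -E '^/(api|v[0-9]+|graphql|rest)/' \
  | sort -u > js_endpoints.txt

cat js_endpoints.txt
```

Paths recovered this way often include endpoints the spider never reached, so feed js_endpoints.txt into the same filtering passes as the spidered URLs.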
Pass 2 — Keep only URLs with input signals:
Not every remaining URL is worth analyzing. A URL is interesting if it carries an input signal: a query parameter, a dynamic path segment, an action verb, or an API versioning pattern. URLs that are purely informational pages with no input surface can be deprioritized.
ACTION_VERBS='create|update|delete|edit|submit|apply|validate|verify|confirm|cancel|reset|search|upload|checkout|pay|refund|invite|subscribe|activate|approve|reject|transfer|redeem|report|assign|connect|disconnect'
{
grep '\?' dynamicurls.txt
grep -E '/[0-9]+(/|$|\?)' dynamicurls.txt
grep -E '/[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}' dynamicurls.txt
grep -Ei "/($ACTION_VERBS)(/|$|\?)" dynamicurls.txt
grep -Ei '/(api|v[0-9]+|graphql|rest)/' dynamicurls.txt
} | sort -u > clean2.txt
Pass 3 — Normalize dynamic segments:
Hundreds of URLs like /orders/1001, /orders/1002, /orders/1003 represent one endpoint, not three hundred. Collapse them:
sed -E 's|/[0-9]+/|/{id}/|g' clean2.txt \
| sed -E 's|/[0-9]+$|/{id}|g' \
| sed -E 's|/[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}|/{uuid}|g' \
| sort -u > normalized.txt
What you now have in normalized.txt is a clean, deduplicated map of your actual attack surface: typically 200 to 500 unique path shapes from a corpus that started in the tens of thousands.
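To sanity-check the funnel, print the line count at each stage. This sketch fabricates a three-line sample corpus (the *_sample.txt names are illustrative) and runs simplified versions of the passes so it works standalone; on real data, point the wc loop at allurls.txt, dynamicurls.txt, and normalized.txt:

```shell
# Funnel report: how much each pass shrinks the corpus.
# The *_sample.txt files are fabricated here so this runs standalone.
printf 'https://target.example/style.css\nhttps://target.example/api/v1/orders/1001\nhttps://target.example/api/v1/orders/1002\n' > allurls_sample.txt

# Simplified versions of the passes above.
grep -Eiv '\.(css|png|jpg)(\?.*)?$' allurls_sample.txt > dynamic_sample.txt
sed -E 's|/[0-9]+$|/{id}|g' dynamic_sample.txt | sort -u > normalized_sample.txt

# Print the line count at each stage.
for f in allurls_sample.txt dynamic_sample.txt normalized_sample.txt; do
  printf '%-22s %s\n' "$f" "$(wc -l < "$f")"
done
```

If a pass barely reduces the count, its filter probably missed a noise pattern specific to your target; if it reduces the count to near zero, the filter is too aggressive.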
Step 3: Feature Mapping — The Core of the Methodology
This is where the methodology separates itself from conventional approaches. Instead of treating your URL list as a flat queue, you group endpoints under the features they belong to.
A feature is something the application does, named the way a product manager would name it, not a description of what a single endpoint does. Examples: Order Lifecycle, Group Ordering, Subscription Management, Merchant Dashboard, Payment Processing.
Why features and not functionalities? Because vulnerabilities rarely live in an endpoint. They live between features, in the assumptions one feature makes about another. A refund endpoint by itself may be perfectly secure. A refund endpoint combined with a cancellation endpoint that doesn't update the same order state is where the money is.
The prompt to use with an AI assistant:
When your normalized URL list is ready, use the following prompt. Claude works best for this task due to its ability to reason about business logic rather than just pattern-match on paths.
Here is a list of spidered URLs from [application name].
[application description in one sentence]
Apply the feature mapping methodology:
- Clean and deduplicate
- Identify features (name them as a product manager would)
- Group endpoints under their feature
- Reconstruct the happy path flow for each feature
- Note the trust assumptions each feature makes
- Flag anomalies
- Output 2–3 vulnerability hypotheses per feature tied to specific endpoints
Additional context:
- Known user roles: [e.g. customer, restaurant, delivery partner, admin]
- Application type: [web app / mobile API / both]
- Known scope restrictions: [if any]
The richer your context, the sharper the hypotheses. Telling the model that a restaurantUUID appears as a query parameter on every merchant dashboard endpoint, rather than being derived from the authenticated session, will produce a specific IDOR hypothesis rather than a generic one.
Step 4: What a Good Feature Map Looks Like
Each feature in your map should contain five things:
Endpoints — the specific paths and methods belonging to this feature.
Happy path flow — the sequence of steps a legitimate user takes through this feature from entry to completion. Write it out. Browse → Add to cart → Apply promo code → Enter payment → Place order → Track delivery.
State transitions — what changes in the application at each step. An order goes from draft to submitted to confirmed to in-transit to delivered. Knowing the state machine tells you which transitions can be skipped, reversed, or repeated.
Trust assumptions — what each step assumes is already true when a request arrives. The checkout endpoint assumes the promo code was already validated. The refund endpoint assumes the order was actually delivered. These assumptions are where business logic flaws live.
Anomalies — paths that don't fit neatly under any feature, endpoints whose naming implies a design inconsistency, version mismatches (V1 and V2 of the same endpoint coexisting), or internal-sounding paths resolving publicly. Anomalies are your first-priority test targets, not because they're automatically vulnerable, but because they represent places where the developer's mental model was inconsistent.
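Version mismatches in particular are cheap to flag mechanically. This is a hedged sketch: shapes.txt is a fabricated stand-in for your normalized URL list, and the version-collapsing sed assumes versions appear as /vN/ path segments, which may not match your target's scheme:

```shell
# Flag path shapes that exist under more than one API version.
# shapes.txt stands in for normalized.txt; contents are fabricated.
printf '/api/v1/draft-order/{uuid}\n/api/v2/draft-order/{uuid}\n/api/v1/cart/{id}\n' > shapes.txt

# Collapse the version segment, then keep shapes seen under 2+ versions.
sed -E 's|/v[0-9]+/|/vN/|g' shapes.txt \
  | sort | uniq -c | awk '$1 > 1 { print $2 }' > version_anomalies.txt

cat version_anomalies.txt
```

Every shape this surfaces is two code paths serving one resource, which is exactly the kind of inconsistency the anomalies list exists to capture.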
Step 5: Form Hypotheses Before Opening Burp
Nothing from your feature map goes into Burp without a hypothesis attached to it. An endpoint without a hypothesis is just noise.
A hypothesis looks like this:
getCartsViewForEaterUuidV1: the endpoint name suggests the eaterUUID is supplied in the POST body. If the server accepts an arbitrary UUID without verifying it matches the authenticated session, this is a POST-body IDOR. Test: intercept the request, substitute the eaterUUID with a UUID from a second test account, observe whether the response returns that account's cart data.
Notice what that hypothesis contains: the specific endpoint, the reasoning behind the suspicion, the exact test action, and the observable outcome that would confirm the vulnerability. That's a testable, documentable, reproducible hypothesis, not a hunch.
The hunters who write these hypotheses before they touch Burp are the ones who find logic flaws.
A Note on Burp Proxy Data
Once you have your feature map and hypotheses, your Burp session becomes far more productive. But Burp proxy logs also feed back into your feature map in ways that spidered URLs cannot.
Spidered URLs give you path shapes. Burp proxy data gives you the actual API contract: response sizes, call frequency, temporal sequencing, and the version of each endpoint being called. From Burp data alone you can identify:
- Response size outliers — a single call to getCheckoutPresentation returning 54,000 bytes when every other call returns under 6,000 suggests the server returned a fundamentally different dataset in that request. That discrepancy is worth investigating before anything else.
- Atomic check-and-mutate endpoints — an endpoint named checkAndUpdateGratisV1 tells you that an eligibility check and a state mutation happen in the same call. No separate validation step means no window to interpose, but it also means that if the eligibility signal is client-controlled, the mutation fires immediately.
- Version coexistence — getDraftOrderByUuidV1 and getDraftOrderByUuidV2 existing in the same session means two code paths serve the same resource. Different versions often have different authorization implementations.
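The response-size outliers are also easy to pull out mechanically. Assuming you export the proxy log as a two-column endpoint,bytes CSV (the format and every number below are fabricated for illustration), a two-pass awk over the file flags anything more than 3x the mean response size:

```shell
# Flag response-size outliers in a proxy log export.
# Assumes a two-column endpoint,bytes CSV; this one is fabricated.
cat > proxy_log.csv <<'EOF'
getCheckoutPresentation,54000
getCartsViewForEaterUuidV1,4200
getDraftOrderByUuidV1,3800
getDraftOrderByUuidV2,3900
EOF

# Pass 1 computes the mean response size; pass 2 prints calls over 3x it.
awk -F, '
  NR == FNR { total += $2; n++; next }
  $2 > 3 * (total / n) { print $1, $2 }
' proxy_log.csv proxy_log.csv > outliers.txt

cat outliers.txt
```

The 3x threshold is an arbitrary starting point; tune it to your target, since a flat API with uniform response sizes deserves a tighter cutoff than one serving mixed payloads.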
The methodology works on both data sources. The mental model is the same: understand the feature, reconstruct the flow, find the assumption, form the hypothesis, then test.
| Phase | What you're doing | Output |
| --- | --- | --- |
| 1 | One-sentence target description | Mental skeleton of the application |
| 2 | Filter and normalize URL list | normalized.txt — clean attack surface |
| 3 | Feature mapping with AI | Structured feature map with grouped endpoints |
| 4 | Flow reconstruction per feature | State machines and trust assumptions |
| 5 | Hypothesis formation | Burp test queue with specific, testable hypotheses |
The methodology is not about working faster. It's about working on the right things. Recon produces data. This process turns data into understanding. Understanding is what finds the bugs that matter.