The Reconnaissance Checklist I Wish I Had Learned Earlier

When I first started learning reconnaissance, I treated it like a race.

Run a subdomain enumeration tool.

Scan a few ports.

Open interesting websites.

Look for vulnerabilities.

Repeat.

The workflow produced plenty of data.

Very little intelligence.

Over time, I realized that successful reconnaissance isn't about collecting the largest list of assets. It's about systematically reducing uncertainty until you understand how an organization's infrastructure is actually connected.

Every forgotten API.

Every abandoned subdomain.

Every exposed storage bucket.

Every development environment.

Each one is part of someone's attack surface, and each one is connected to something else.

The goal of reconnaissance is to discover those relationships before an attacker — or another bug bounty hunter — does.

This is the checklist I wish I had learned much earlier.

Step 1: Define the Target, Not Just the Domain

Many beginners start with:

example.com

But modern organizations rarely consist of a single domain.

Instead, think in terms of assets.

Questions to answer first:

Which root domains are in scope?
Which acquisitions belong to the company?
Which regional domains exist?
Which cloud environments are publicly accessible?
Which brands share infrastructure?
Which APIs support mobile applications?
Which development environments are internet-facing?

Reconnaissance starts with understanding the organization — not the hostname.
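
One way to make that concrete is to write the scope down as data before running any tool. The sketch below is a minimal example under stated assumptions: the root domains and the exclusion are placeholders I made up, and a real program's published scope should replace them.

```python
# Hypothetical scope; replace with the program's published scope.
IN_SCOPE_ROOTS = {"example.com", "example.org"}
OUT_OF_SCOPE = {"blog.example.com"}


def in_scope(hostname: str) -> bool:
    """Return True if hostname is an in-scope root or a subdomain of one."""
    host = hostname.strip().rstrip(".").lower()
    # Explicit exclusions (and their subdomains) win over inclusions.
    for excluded in OUT_OF_SCOPE:
        if host == excluded or host.endswith("." + excluded):
            return False
    # Match on a label boundary so "notexample.com" is not treated as in scope.
    return any(host == root or host.endswith("." + root) for root in IN_SCOPE_ROOTS)
```

Running every discovered hostname through a check like this keeps later steps from drifting onto assets the organization never authorized.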

Step 2: Build an Asset Inventory

Every discovered asset becomes another pivot point.

Typical asset categories include:

Domains
Subdomains
IP addresses
ASN allocations
DNS records
TLS certificates
Cloud storage
Source repositories
Public APIs
Container registries
Package repositories
Developer portals

Treat reconnaissance like building an inventory database.

Every new asset should answer one question:

What else does this reveal?
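
The inventory idea can be sketched as a small data structure. This is an illustration, not a prescribed schema: the class names, the field names, and the "reveals" link are my own assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    kind: str    # e.g. "subdomain", "ip", "tls_cert"
    value: str   # the identifier itself
    source: str  # where it was first observed
    reveals: set[str] = field(default_factory=set)  # values of related assets


class Inventory:
    def __init__(self) -> None:
        self.assets: dict[tuple[str, str], Asset] = {}

    def add(self, kind: str, value: str, source: str) -> Asset:
        """Add an asset, or return the existing record if already known."""
        key = (kind, value.lower())
        # The first sighting is kept; later sightings do not overwrite it.
        return self.assets.setdefault(key, Asset(kind, value.lower(), source))

    def link(self, a: Asset, b: Asset) -> None:
        """Record that two assets are related, in both directions."""
        a.reveals.add(b.value)
        b.reveals.add(a.value)
```

The link method is where the "what else does this reveal?" question gets recorded, so that relationships survive after the original tool output is gone.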

Step 3: Enumerate Subdomains From Multiple Sources

Passive enumeration should always come before active scanning.

Different intelligence sources observe different portions of the internet.

Examples include:

Certificate Transparency logs
Passive DNS datasets
Search engine indexes
Public DNS archives
Internet-wide scan datasets
Security search engines

No single source is complete.

Combining multiple passive sources consistently produces broader coverage than relying on one tool.

The objective isn't to collect thousands of subdomains.

It's to discover the ones everyone else misses.
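
Combining sources is mostly a normalization and bookkeeping problem. The sketch below assumes each source's output has already been collected into a list of hostnames; the source labels such as "ct_logs" are hypothetical, and no particular tool or API is implied.

```python
from collections import defaultdict


def merge_sources(results: dict[str, list[str]]) -> dict[str, set[str]]:
    """Map each normalized hostname to the set of sources that reported it."""
    seen: dict[str, set[str]] = defaultdict(set)
    for source, hostnames in results.items():
        for name in hostnames:
            host = name.strip().rstrip(".").lower()
            # Certificate data often lists wildcard names; keep the base name.
            if host.startswith("*."):
                host = host[2:]
            if host:
                seen[host].add(source)
    return dict(seen)


def single_source(merged: dict[str, set[str]]) -> list[str]:
    """Hostnames that only one source knew about."""
    return sorted(h for h, sources in merged.items() if len(sources) == 1)
```

Hostnames reported by a single source are a reasonable first place to look for the assets other people missed, since anyone relying on a different source never saw them.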

Step 4: Resolve Before You Scan

Large subdomain lists often contain:

Expired entries
Typographical errors
Internal hostnames
Decommissioned infrastructure
Parked domains

Before investing time in scanning, determine which assets actually resolve.

This reduces unnecessary requests while improving the quality of subsequent analysis.

A smaller list of verified hosts is significantly more valuable than a larger list filled with inactive entries.
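
A minimal resolution filter, using only the standard library, might look like the sketch below. It relies on the system resolver through socket.getaddrinfo and is sequential, so treat it as an illustration of the step rather than something sized for large lists. The injectable resolver parameter is my own addition to make the function easy to test offline.

```python
import socket
from typing import Callable


def resolve(hostname: str) -> list[str]:
    """Return the sorted unique addresses a hostname resolves to, or []."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except (socket.gaierror, UnicodeError):
        return []
    # Each entry ends with a sockaddr tuple whose first item is the address.
    return sorted({info[4][0] for info in infos})


def filter_resolving(hostnames, resolver: Callable[[str], list[str]] = resolve):
    """Keep only hostnames that resolve, mapped to their addresses."""
    resolved = {}
    for host in hostnames:
        addresses = resolver(host)
        if addresses:
            resolved[host] = addresses
    return resolved
```

Keeping the addresses rather than a plain yes or no pays off later, because shared IP addresses are one of the simplest relationships between assets.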

Step 5: Identify Live Services

A hostname without a running service rarely deserves immediate attention.

Determine:

Which hosts respond over HTTP or HTTPS?
Which ports expose applications?
Which protocols are available?
Which redirects occur?
Which services return meaningful responses?

At this stage, you're mapping the organization's externally reachable infrastructure rather than searching for vulnerabilities.

Coverage is more important than depth.
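
The simplest form of this check is a TCP connection attempt per port. The sketch below is a hedged illustration using the standard library; the connect parameter is a hypothetical hook I added so the function can be exercised without touching the network, and it should only ever be pointed at hosts that are in scope.

```python
import socket


def open_ports(host, ports, timeout=2.0, connect=socket.create_connection):
    """Return the subset of ports that accept a TCP connection."""
    reachable = []
    for port in ports:
        try:
            # The connection is closed as soon as the with block exits.
            with connect((host, port), timeout=timeout):
                reachable.append(port)
        except OSError:
            # Refused, filtered, timed out, or unresolvable.
            continue
    return reachable
```

An accepted connection only shows that something is listening. Which protocol it speaks and whether it returns a meaningful response are separate questions for the next steps.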

Step 6: Fingerprint the Technology Stack

Every exposed application reveals implementation details.

Collect information such as:

Web server software
Reverse proxy
Programming language
Framework
CMS
JavaScript libraries
API technologies
Authentication mechanisms
CDN providers
Cloud platform indicators

Modern fingerprinting is less about identifying versions and more about understanding architecture.

Knowing an application uses GraphQL, serverless functions, Kubernetes ingress, or object storage immediately influences the next phase of testing.
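
Response headers are an easy starting point for this kind of fingerprinting. The rule table below is a small, illustrative sample rather than a complete signature set, and headers can be stripped or spoofed, so each hit is a lead to confirm rather than a fact.

```python
# Header-to-hint rules. Illustrative and deliberately short.
HEADER_HINTS = {
    "cf-ray": "Cloudflare in front of the origin",
    "x-amz-request-id": "Amazon S3 or another AWS service",
    "x-amz-cf-id": "Amazon CloudFront",
    "x-powered-by": "application framework or runtime disclosed",
    "x-aspnet-version": "ASP.NET",
}


def fingerprint(headers: dict[str, str]) -> list[str]:
    """Return architecture hints suggested by response headers."""
    # Header names are case-insensitive, so compare in lowercase.
    lowered = {name.lower(): value for name, value in headers.items()}
    hints = [f"{hint} ({name}: {lowered[name]})"
             for name, hint in HEADER_HINTS.items() if name in lowered]
    if "server" in lowered:
        hints.append(f"server banner: {lowered['server']}")
    return hints
```

The output is phrased as hints about architecture (a CDN in front, object storage behind) because that is what changes the next phase of testing, more than an exact version string does.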

Step 7: Enumerate Content, Not Just Endpoints

Many assessments stop after discovering:

/admin
/login
/api
/dashboard

Modern applications expose much richer attack surfaces.

Look for:

OpenAPI or Swagger specifications
GraphQL endpoints
Versioned REST APIs
Static JavaScript bundles
Source maps
Backup archives
Configuration files
Debug interfaces
Health check endpoints
Metrics endpoints
WebSocket services

Client-side JavaScript is particularly valuable because it frequently references undocumented APIs, internal routes, feature flags, and third-party integrations that aren't visible from the homepage.
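
Pulling path-like strings out of a JavaScript bundle can be approximated with a regular expression. The pattern below is a loose heuristic of my own, not a parser: it will miss paths built by string concatenation and it will produce false positives that need manual review.

```python
import re

# Quoted strings that look like absolute paths, e.g. "/api/v2/users".
PATH_PATTERN = re.compile(r"""["'`](/[A-Za-z0-9_\-./{}:]+)["'`]""")


def extract_paths(javascript: str) -> list[str]:
    """Return sorted unique path-like string literals from JavaScript."""
    paths = set()
    for match in PATH_PATTERN.finditer(javascript):
        path = match.group(1)
        # Skip protocol-relative URLs such as "//cdn.example.com/x.js".
        if not path.startswith("//"):
            paths.add(path)
    return sorted(paths)
```

Comparing the extracted paths against what the site's own navigation exposes is a quick way to spot routes that exist in the code but are not linked anywhere.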

Step 8: Analyze HTTP Behavior

The response itself often reveals more than the page content.

Pay attention to:

Headers

Server
Cache-Control
Content-Security-Policy
Access-Control-Allow-Origin
X-Powered-By
Strict-Transport-Security

Cookies

Secure
HttpOnly
SameSite
Domain
Path

Responses

Redirect chains
Authentication flow
Compression
Caching behavior
Error messages

Small implementation details frequently expose architectural decisions that guide further testing.
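
Cookie attributes are one of the easier details to check mechanically. The sketch below uses Python's standard http.cookies module and assumes it is given the value of a single Set-Cookie header; the function name and the shape of the result are my own choices.

```python
from http.cookies import SimpleCookie


def cookie_findings(set_cookie_header: str) -> dict[str, list[str]]:
    """Map each cookie name to the protective attributes it lacks."""
    jar = SimpleCookie()
    jar.load(set_cookie_header)
    findings = {}
    for name, morsel in jar.items():
        missing = []
        # Unset attributes are empty strings, which are falsy.
        if not morsel["secure"]:
            missing.append("Secure")
        if not morsel["httponly"]:
            missing.append("HttpOnly")
        if not morsel["samesite"]:
            missing.append("SameSite")
        findings[name] = missing
    return findings
```

A missing attribute is not a vulnerability on its own. It is a hint about how carefully the session layer was configured, which is the kind of architectural signal this step is looking for.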

Step 9: Look for Relationships Between Assets

One isolated host rarely tells the complete story.

Ask questions like:

Does the mobile API use different authentication?
Does staging trust production credentials?
Do multiple applications share the same session cookie?
Does one domain expose internal API endpoints used by another?
Do development environments reference production resources?

The most valuable findings often emerge from interactions between systems rather than weaknesses in a single application.
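
Questions like these become easier to answer once observations are inverted: instead of listing what each host has, list which hosts share something. The sketch below assumes observations have been tagged with simple strings such as "ip:192.0.2.10" or "cookie:SESSIONID"; that tagging scheme is hypothetical.

```python
from collections import defaultdict


def shared_attributes(observations: dict[str, set[str]]) -> dict[str, list[str]]:
    """Invert host -> attributes into attribute -> hosts, keeping shared ones."""
    by_attribute = defaultdict(set)
    for host, attributes in observations.items():
        for attribute in attributes:
            by_attribute[attribute].add(host)
    # Only attributes seen on more than one host describe a relationship.
    return {attr: sorted(hosts)
            for attr, hosts in by_attribute.items() if len(hosts) > 1}
```

A session cookie name shared by production and staging, for example, would show up as a single entry listing both hosts, which is a prompt to ask whether they also share a session store.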

Step 10: Search for Forgotten Infrastructure

Organizations evolve.

Infrastructure rarely disappears as quickly as projects do.

Common examples include:

Legacy applications
Staging environments
QA servers
Development portals
Old VPN gateways
Temporary migration hosts
Deprecated API versions
Retired administrative interfaces

These systems often receive less maintenance while remaining publicly accessible.

Age frequently correlates with increased attack surface.
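
Hostnames often carry hints about age or purpose. The sketch below flags names containing a short list of keywords; the list is an assumption of mine, and real organizations follow their own naming conventions, so expect to adjust it per target.

```python
import re

# Hypothetical naming hints; adjust per organization.
LEGACY_HINTS = ("legacy", "old", "staging", "stage", "qa", "dev", "test",
                "uat", "vpn", "migration", "backup", "v1")


def looks_forgotten(hostname: str) -> list[str]:
    """Return the legacy-style hints found among a hostname's tokens."""
    hits = []
    # Split labels on dots, hyphens, and underscores.
    for token in re.split(r"[.\-_]", hostname.lower()):
        base = token.rstrip("0123456789")  # "dev2" -> "dev"
        for candidate in (token, base):
            if candidate in LEGACY_HINTS and candidate not in hits:
                hits.append(candidate)
    return hits
```

Matching whole tokens rather than substrings avoids flagging names like "devices" just because they contain "dev".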

Step 11: Prioritize by Exposure, Not Curiosity

Not every discovered asset deserves equal attention.

Prioritize systems that exhibit characteristics such as:

Public authentication
Administrative functionality
File uploads
Payment workflows
Identity management
API gateways
Cloud administration
Object storage
Data export capabilities

Attack surface prioritization is an operational skill.

Finding one hundred applications is easy.

Knowing which five deserve investigation is considerably harder.
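
A simple scoring pass can turn the characteristics above into a ranked shortlist. The trait names and weights below are illustrative assumptions, not measured values, and they should be tuned to the program and its business context.

```python
# Illustrative weights only.
WEIGHTS = {
    "public_auth": 3,
    "admin": 5,
    "file_upload": 4,
    "payments": 5,
    "identity": 4,
    "api_gateway": 3,
    "cloud_admin": 5,
    "object_storage": 3,
    "data_export": 4,
}


def prioritize(assets: dict[str, set[str]], top: int = 5) -> list[tuple[str, int]]:
    """Rank assets by the summed weight of their observed traits."""
    scored = [(name, sum(WEIGHTS.get(trait, 0) for trait in traits))
              for name, traits in assets.items()]
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top]
```

The score is only a way to order the queue. Deciding that a low-scoring host is actually the interesting one remains a judgment call.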

Step 12: Let Automation Handle Discovery, Not Thinking

Modern reconnaissance tools have become remarkably capable.

They can enumerate subdomains, validate hosts, fingerprint technologies, crawl applications, extract endpoints, and correlate internet-scale datasets in minutes.

That automation is invaluable.

But tools only answer the questions you ask them.

They do not decide:

Which asset looks unusual.
Which architecture appears inconsistent.
Which authentication boundary deserves closer examination.
Which exposed service represents the highest business risk.

Reconnaissance is still an analytical discipline.

Automation accelerates collection.

Human reasoning identifies opportunity.

The Most Important Lesson

Early in my learning journey, I believed reconnaissance was about accumulating more data than everyone else.

Today, I think about it differently.

Good reconnaissance produces a model of the target.

You understand:

What assets exist.
How they communicate.
Which technologies they rely on.
Where trust boundaries are established.
Which systems appear neglected.
Which components expose meaningful attack surface.

Only then does vulnerability research become efficient.

The best findings rarely come from scanning the largest number of hosts.

They come from understanding the relationships between a small number of assets that everyone else treated as unrelated.

The clue is usually not the individual subdomain.

It's the architecture that the subdomain quietly reveals.
