Reconnaissance & Information Gathering: The Foundation of Every VAPT Engagement

What Is Reconnaissance?

Before a penetration tester ever touches a target system, there's a critical phase that happens first: reconnaissance, often shortened to recon. It's the process of collecting as much information as possible about a target — their infrastructure, employees, domains, services, and digital footprint — before any active testing begins.

Think of it like detective work: you don't approach a suspect's house without first learning their schedule, routines, and entry points. Recon works the same way, building the map before you navigate it.

In the VAPT (Vulnerability Assessment and Penetration Testing) lifecycle, reconnaissance is Phase 1. Everything that comes after — scanning, exploitation, post-exploitation — depends entirely on how thorough and accurate your recon was.

Why It Matters

Poor recon = missed attack surfaces = incomplete assessments.

If you don't know about a forgotten subdomain, an exposed employee email, or a misconfigured internet-facing service during recon, you simply won't test it. That blind spot could be the exact thing a real attacker would exploit.

Good reconnaissance:

Reveals the true scope of the target's digital exposure
Helps prioritize where to focus scanning and testing efforts
Reduces noise by giving context to findings
Helps avoid tripping alarms by understanding what's out there before touching it

Passive vs Active Recon

This is one of the most important distinctions in this phase.

Passive reconnaissance means gathering information without directly interacting with the target's systems. You rely on third-party sources such as public databases, search engines, social media, and public DNS data. Because the queries go to those third parties rather than to the target, the target's own logs do not record your activity.

Examples: Looking up WHOIS records, searching LinkedIn for employees, using Shodan to find exposed services.

Active reconnaissance involves directly interacting with the target — sending packets, making requests, or querying systems. This leaves traces in logs and can trigger alerts.

Examples: Running Nmap against a target's IP range, enumerating DNS via zone transfers, actively crawling a web application.

During a legitimate VAPT engagement, both types are used — but passive always comes first, and active recon is done within the approved scope window.
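
One way to enforce that scope rule is to check every candidate address against the approved ranges before any active tool runs. The sketch below is a minimal illustration using Python's standard ipaddress module. The CIDR ranges and the in_scope helper are hypothetical and not taken from any real engagement.

```python
import ipaddress

# Hypothetical scope, as it might appear in a rules-of-engagement document.
APPROVED_SCOPE = ["203.0.113.0/24", "198.51.100.16/28"]


def in_scope(target_ip, scope=APPROVED_SCOPE):
    """Return True if target_ip falls inside any approved CIDR range."""
    ip = ipaddress.ip_address(target_ip)
    return any(ip in ipaddress.ip_network(cidr) for cidr in scope)


if __name__ == "__main__":
    for candidate in ["203.0.113.45", "198.51.100.40"]:
        status = "in scope" if in_scope(candidate) else "OUT OF SCOPE, do not scan"
        print(f"{candidate}: {status}")
```

Running a check like this as a gate in front of scanners keeps an accidental typo in an IP range from turning into unauthorized activity.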

OSINT: Open Source Intelligence

OSINT is the art of gathering intelligence from publicly available sources. It's a cornerstone of passive recon.

Common OSINT sources include:

Search engines — Google, Bing, Shodan (for internet-connected devices)
Social media — LinkedIn for employee names, roles, and tech stacks; Twitter/X for inadvertent info leaks
Public records — WHOIS, DNS registrars, certificate transparency logs
Job postings — These often reveal what technologies a company uses internally
GitHub and code repositories — Developers sometimes accidentally push credentials or internal URLs

A solid OSINT pass can tell you the company's email format, internal technology stack, names of IT staff, subdomains, and even software versions — all without touching their network once.
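
Once the email format is known, candidate addresses can be derived from the employee names collected during OSINT. The following is a rough sketch of that step, assuming three common patterns. The pattern names, the candidate_email helper, and the sample names are all illustrative.

```python
# Hypothetical address patterns; the real one is inferred from OSINT findings.
PATTERNS = {
    "first.last": "{first}.{last}",
    "flast": "{f}{last}",
    "firstl": "{first}{l}",
}


def candidate_email(full_name, domain, pattern="first.last"):
    """Build one candidate address from a 'First Last' style name."""
    parts = full_name.lower().split()
    first, last = parts[0], parts[-1]
    local = PATTERNS[pattern].format(first=first, last=last, f=first[0], l=last[0])
    return f"{local}@{domain}"


if __name__ == "__main__":
    # Placeholder names, not real people.
    for name in ["Jane Doe", "John Smith"]:
        print(candidate_email(name, "target.com", pattern="flast"))
```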

Footprinting

Footprinting is the structured process of mapping out a target's digital footprint. It feeds directly from OSINT and covers:

Domain and subdomain enumeration — What domains does this organization own? What subdomains are exposed?
IP range identification — What IP addresses does the organization control?
Email harvesting — What employee emails are publicly discoverable?
Technology fingerprinting — What web server, CMS, or cloud provider are they using?
Network topology mapping — What can we infer about how their infrastructure is connected?

Footprinting bridges the gap between raw data collection (OSINT) and targeted scanning.
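
Subdomain enumeration often starts from certificate transparency logs. The sketch below assumes search results in a JSON shape similar to what CT search services such as crt.sh return, where each entry carries a name_value field holding one or more newline-separated hostnames. The sample data is made up, and the subdomains_from_ct helper is hypothetical.

```python
import json


def subdomains_from_ct(ct_json, domain):
    """Extract unique hostnames under `domain` from CT-log search results."""
    found = set()
    for entry in json.loads(ct_json):
        # One certificate can list several names, separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # treat wildcard entries as their base name
            if name == domain or name.endswith("." + domain):
                found.add(name)
    return sorted(found)


if __name__ == "__main__":
    # Made-up sample; real data would come from a CT search service.
    sample = json.dumps([
        {"name_value": "www.target.com\nmail.target.com"},
        {"name_value": "*.dev.target.com"},
        {"name_value": "unrelated.example.org"},
    ])
    print(subdomains_from_ct(sample, "target.com"))
```

Filtering on the domain suffix matters because a single certificate can cover names belonging to unrelated organizations.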

Key Tools in This Phase

theHarvester

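A straightforward but powerful tool for harvesting emails, subdomains, IPs, and URLs from public sources. It queries several sources, such as search engines and services like VirusTotal, in a single run. The set of supported sources changes between releases, so the values passed to -b below may need adjusting; run theHarvester -h to see what your version accepts.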

theHarvester -d target.com -b google,linkedin

Best used early — it gives you a quick overview of an organization's publicly exposed digital assets.

Maltego

A visual intelligence tool that maps relationships between entities — domains, emails, people, IP addresses, social media profiles, and more. It uses transforms (automated queries against data sources) to build a web of connections.

Where theHarvester gives you a list, Maltego gives you a graph. It's especially powerful for visualizing how employees, domains, and infrastructure interconnect — useful in corporate and enterprise engagements.

Shodan

Often called "the search engine for hackers," Shodan indexes internet-connected devices — routers, webcams, industrial control systems, databases, and servers — along with their banners, ports, and service versions.

A Shodan search scoped to a target's organization name or IP range can reveal:

Open ports and services
SSL certificate metadata
Software versions
Exposed admin panels, including some that appear to still use default credentials

org:"Target Company Name"

This is passive recon at its most powerful — you're querying Shodan's existing crawl data, never touching the target directly.
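
Search results are easier to act on once they are grouped by host. The sketch below assumes a response dictionary shaped like a Shodan search result, with a matches list whose entries carry ip_str, port, and optionally product and version. The sample data is fabricated and uses documentation-reserved addresses, and summarize_matches is a hypothetical helper. Fetching real results would need an API key and a client such as the official shodan Python library, which is left out here so the example runs offline.

```python
from collections import defaultdict


def summarize_matches(results):
    """Group exposed services by IP from a Shodan-style search response."""
    services = defaultdict(list)
    for match in results.get("matches", []):
        product = match.get("product", "unknown")
        version = match.get("version", "")
        label = f"{match['port']}/{product} {version}".strip()
        services[match["ip_str"]].append(label)
    return dict(services)


if __name__ == "__main__":
    # Fabricated sample, not real scan data.
    sample = {
        "matches": [
            {"ip_str": "203.0.113.10", "port": 443, "product": "nginx", "version": "1.18.0"},
            {"ip_str": "203.0.113.10", "port": 22, "product": "OpenSSH"},
            {"ip_str": "203.0.113.22", "port": 3306},
        ]
    }
    for ip, found in summarize_matches(sample).items():
        print(ip, found)
```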

Recon-ng

A full-featured reconnaissance framework built in Python, modeled after Metasploit's workflow. It uses modules to query different data sources — WHOIS, DNS, LinkedIn, breach databases, and more — and stores all results in a local database for structured analysis.

recon-ng
[recon-ng] > marketplace install all
[recon-ng] > workspaces create target_engagement
[recon-ng] > modules load recon/domains-hosts/hackertarget
[recon-ng] > options set SOURCE target.com
[recon-ng] > run

Recon-ng is ideal when you want a repeatable, documented recon workflow — especially useful for reporting purposes.
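
Because results live in a local database, they can be pulled into a report with a few lines of Python. The sketch below assumes the workspace database is a SQLite file containing a hosts table with host and ip_address columns; treat that schema as an assumption and confirm it against your own installation. To stay runnable without Recon-ng, it builds a simplified in-memory stand-in with invented rows.

```python
import sqlite3


def list_hosts(conn):
    """Return (host, ip_address) rows from a Recon-ng-style hosts table."""
    cur = conn.execute(
        "SELECT DISTINCT host, ip_address FROM hosts "
        "WHERE host IS NOT NULL ORDER BY host"
    )
    return cur.fetchall()


def demo_connection():
    """Build an in-memory stand-in so the sketch runs without Recon-ng."""
    conn = sqlite3.connect(":memory:")
    # Simplified, assumed schema; a real workspace table has more columns.
    conn.execute("CREATE TABLE hosts (host TEXT, ip_address TEXT, module TEXT)")
    conn.executemany(
        "INSERT INTO hosts VALUES (?, ?, ?)",
        [
            ("www.target.com", "203.0.113.10", "hackertarget"),
            ("mail.target.com", "203.0.113.22", "hackertarget"),
        ],
    )
    return conn


if __name__ == "__main__":
    # For a real workspace, connect to its database file instead.
    for host, ip in list_hosts(demo_connection()):
        print(f"{host}\t{ip}")
```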

The MITRE ATT&CK Connection

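Reconnaissance maps directly to the Reconnaissance tactic in the MITRE ATT&CK framework (TA0043). Techniques under that tactic that correspond to the activities in this post include:

T1589: Gather Victim Identity Information (email addresses, employee names)
T1590: Gather Victim Network Information (domains, DNS, IP ranges)
T1593: Search Open Websites/Domains (search engines, social media, code repositories)
T1596: Search Open Technical Databases (WHOIS, certificates, scan databases such as Shodan)
T1595: Active Scanning (IP block and vulnerability scanning)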

Mapping your recon activities to ATT&CK is important for writing accurate VAPT reports — it gives clients a standardized reference point for what was done and what risks it relates to.
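
A lightweight way to keep that mapping consistent is a lookup table that turns each logged activity into a report line. The sketch below is one possible layout. The technique IDs come from the ATT&CK Reconnaissance tactic, while the activity labels and the report_lines helper are this example's own naming.

```python
# Technique IDs from the ATT&CK Reconnaissance tactic (TA0043).
# The activity labels on the left are hypothetical names for this sketch.
ATTACK_MAP = {
    "whois_lookup": ("T1596.002", "Search Open Technical Databases: WHOIS"),
    "ct_log_search": ("T1596.003", "Search Open Technical Databases: Digital Certificates"),
    "shodan_search": ("T1596.005", "Search Open Technical Databases: Scan Databases"),
    "email_harvesting": ("T1589.002", "Gather Victim Identity Information: Email Addresses"),
    "linkedin_review": ("T1593.001", "Search Open Websites/Domains: Social Media"),
    "port_scan": ("T1595.001", "Active Scanning: Scanning IP Blocks"),
}


def report_lines(activities):
    """Turn a list of logged activities into ATT&CK-tagged report lines."""
    lines = []
    for activity in activities:
        technique_id, name = ATTACK_MAP.get(
            activity, ("unmapped", "needs manual review")
        )
        lines.append(f"{activity}: {technique_id} ({name})")
    return lines


if __name__ == "__main__":
    for line in report_lines(["whois_lookup", "shodan_search", "dumpster_diving"]):
        print(line)
```

Flagging unmapped activities instead of dropping them keeps the report honest about anything that still needs a manual classification.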

Recon in the Real World: The Flow

Define Scope
     ↓
Passive OSINT (Shodan, theHarvester, LinkedIn, WHOIS)
     ↓
Footprinting (subdomains, email format, IP ranges)
     ↓
Visual Mapping (Maltego)
     ↓
Structured Analysis (Recon-ng)
     ↓
Active Recon (only within approved scope)
     ↓
Feed findings into Phase 2: Scanning & Enumeration

Key Takeaways

Reconnaissance is the foundation — weak recon leads to incomplete testing
Always begin passively; only go active when authorized
OSINT can reveal a surprising amount about any organization's attack surface
Tools like theHarvester, Shodan, Maltego, and Recon-ng each serve a distinct purpose — use them together, not in isolation
Document everything and map it to MITRE ATT&CK for professional reporting

This is Part 1 of a VAPT lifecycle series. The next post covers scanning and enumeration — where we take the map we just built and start probing it.
