June 26, 2026
Reconnaissance & Information Gathering: The Foundation of Every VAPT Engagement
The phase no one talks about enough — and why skipping it ruins everything downstream.
By R. Mahathi
What Is Reconnaissance?
Before a penetration tester ever touches a target system, there's a critical phase that happens first: reconnaissance, often shortened to recon. It's the process of collecting as much information as possible about a target — their infrastructure, employees, domains, services, and digital footprint — before any active testing begins.
Think of it like a detective case. You don't walk into a suspect's house without knowing their schedule, routines, and entry points first. Recon is exactly that — building your map before you navigate.
In the VAPT (Vulnerability Assessment and Penetration Testing) lifecycle, reconnaissance is Phase 1. Everything that comes after — scanning, exploitation, post-exploitation — depends entirely on how thorough and accurate your recon was.
Why It Matters
Poor recon = missed attack surfaces = incomplete assessments.
If you don't know about a forgotten subdomain, an exposed employee email, or a misconfigured internet-facing service during recon, you simply won't test it. That blind spot could be the exact thing a real attacker would exploit.
Good reconnaissance:
- Reveals the true scope of the target's digital exposure
- Helps prioritize where to focus scanning and testing efforts
- Reduces noise by giving context to findings
- Helps avoid tripping alarms by understanding what's out there before touching it
Passive vs Active Recon
This is one of the most important distinctions in this phase.
Passive reconnaissance means gathering information without directly interacting with the target's systems. You rely on third-party sources (public databases, search engines, social media, published DNS and certificate data), so your activity does not show up in the target's own logs.
Examples: Looking up WHOIS records, searching LinkedIn for employees, using Shodan to find exposed services.
Active reconnaissance involves directly interacting with the target — sending packets, making requests, or querying systems. This leaves traces in logs and can trigger alerts.
Examples: Running Nmap against a target's IP range, enumerating DNS via zone transfers, actively crawling a web application.
During a legitimate VAPT engagement, both types are used, but passive recon comes first, and active recon happens only against approved targets and inside the agreed testing window.
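The "approved targets, agreed window" rule can be enforced in tooling rather than left to memory. The sketch below is one possible guard, not a standard implementation: the network ranges, the window times, and the function name are all hypothetical placeholders, and real values would come from the signed rules of engagement.

```python
from datetime import datetime, timezone
from ipaddress import ip_address, ip_network

# Hypothetical scope values for illustration only.
APPROVED_NETWORKS = [ip_network("203.0.113.0/24"), ip_network("198.51.100.16/28")]
WINDOW_START = datetime(2026, 7, 1, 22, 0, tzinfo=timezone.utc)
WINDOW_END = datetime(2026, 7, 2, 4, 0, tzinfo=timezone.utc)


def active_recon_allowed(target_ip: str, now: datetime) -> bool:
    """Return True only if the IP is in scope and `now` is inside the test window."""
    addr = ip_address(target_ip)
    in_scope = any(addr in net for net in APPROVED_NETWORKS)
    in_window = WINDOW_START <= now <= WINDOW_END
    return in_scope and in_window
```

Calling a check like this before every active step (a port scan, a crawl, a zone transfer attempt) keeps out-of-scope hosts from being touched by accident.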
OSINT: Open Source Intelligence
OSINT is the art of gathering intelligence from publicly available sources. It's a cornerstone of passive recon.
Common OSINT sources include:
- Search engines — Google, Bing, Shodan (for internet-connected devices)
- Social media — LinkedIn for employee names, roles, and tech stacks; Twitter/X for inadvertent info leaks
- Public records — WHOIS, DNS registrars, certificate transparency logs
- Job postings — These often reveal what technologies a company uses internally
- GitHub and code repositories — Developers sometimes accidentally push credentials or internal URLs
A solid OSINT pass can tell you the company's email format, internal technology stack, names of IT staff, subdomains, and even software versions — all without touching their network once.
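As an illustration of how an email format falls out of OSINT data, here is a minimal sketch that compares a few publicly found addresses against the names of the people they belong to. The pattern labels, function names, and sample data are assumptions made for this example; real organizations may use formats that are not covered here.

```python
from collections import Counter


def classify_local_part(local: str, first: str, last: str) -> str:
    """Label the part before the @ with a naming pattern, or 'unknown'."""
    first, last, local = first.lower(), last.lower(), local.lower()
    if not first or not last:
        return "unknown"
    patterns = {
        f"{first}.{last}": "first.last",
        f"{last}.{first}": "last.first",
        f"{first[0]}{last}": "flast",
        f"{first}{last[0]}": "firstl",
        first: "first",
    }
    return patterns.get(local, "unknown")


def infer_email_format(samples):
    """samples: iterable of (email, first_name, last_name) tuples.

    Returns the most common recognized pattern, or None if nothing matched.
    """
    counts = Counter()
    for email, first, last in samples:
        local = email.split("@", 1)[0]
        counts[classify_local_part(local, first, last)] += 1
    counts.pop("unknown", None)  # unmatched addresses should not win the vote
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

Given three harvested addresses where two follow asha.rao@target.com style and one follows jdoe@target.com style, the function reports "first.last" as the likely convention.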
Footprinting
Footprinting is the structured process of mapping out a target's digital footprint. It feeds directly from OSINT and covers:
- Domain and subdomain enumeration — What domains does this organization own? What subdomains are exposed?
- IP range identification — What IP addresses does the organization control?
- Email harvesting — What employee emails are publicly discoverable?
- Technology fingerprinting — What web server, CMS, or cloud provider are they using?
- Network topology mapping — What can we infer about how their infrastructure is connected?
Footprinting bridges the gap between raw data collection (OSINT) and targeted scanning.
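Subdomain enumeration is a good example of that bridge. The sketch below assumes you have already exported certificate transparency records as a list of dictionaries with a name_value field holding newline-separated hostnames (the shape crt.sh uses in its JSON output, treated here as an assumption). The function name and sample records are hypothetical.

```python
def subdomains_from_ct(records, root_domain):
    """Return sorted, de-duplicated hostnames that belong to root_domain."""
    root = root_domain.lower().strip(".")
    found = set()
    for record in records:
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower().rstrip(".")
            if name.startswith("*."):
                name = name[2:]  # a wildcard cert still proves the parent zone exists
            # Match on a label boundary so "target.com.evil.example" is rejected.
            if name == root or name.endswith("." + root):
                found.add(name)
    return sorted(found)


# Hypothetical sample records for illustration.
SAMPLE_RECORDS = [
    {"name_value": "www.target.com\nmail.target.com"},
    {"name_value": "*.dev.target.com"},
    {"name_value": "WWW.target.com"},
    {"name_value": "target.com.evil.example"},
]
```

The label-boundary check matters: a plain substring match would pull unrelated domains into scope, which is exactly the kind of error that causes out-of-scope testing later.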
Key Tools in This Phase
theHarvester
A straightforward but powerful tool for harvesting emails, subdomains, IPs, and URLs from public sources. It can query several sources in one run, such as Google, Bing, LinkedIn, and VirusTotal. The supported source list changes between releases, so check theHarvester -h for the names your version accepts.
theHarvester -d target.com -b google,linkedin
Best used early: it gives you a quick overview of an organization's publicly exposed digital assets.
Maltego
A visual intelligence tool that maps relationships between entities — domains, emails, people, IP addresses, social media profiles, and more. It uses transforms (automated queries against data sources) to build a web of connections.
Where theHarvester gives you a list, Maltego gives you a graph. It's especially powerful for visualizing how employees, domains, and infrastructure interconnect — useful in corporate and enterprise engagements.
Shodan
Often called "the search engine for hackers," Shodan indexes internet-connected devices — routers, webcams, industrial control systems, databases, and servers — along with their banners, ports, and service versions.
A simple Shodan search for a target's IP range can reveal:
- Open ports and services
- SSL certificate metadata
- Software versions
- Exposed admin panels, some of which still accept default credentials
org:"Target Company Name"
This is passive recon at its most powerful: you're querying Shodan's existing crawl data, never touching the target directly.
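The same query can be run from a script. The sketch below uses the third-party shodan Python package (its Shodan class and search method) and assumes a valid API key; the helper names are made up for this example. The summarizer works on plain dictionaries, so it can be tried on saved results without calling the API at all.

```python
from collections import defaultdict


def summarize_matches(matches):
    """Group Shodan-style match records into {ip: sorted list of (port, product)}."""
    exposure = defaultdict(set)
    for match in matches:
        ip = match.get("ip_str")
        port = match.get("port")
        if ip is None or port is None:
            continue  # skip incomplete records rather than guessing
        exposure[ip].add((port, match.get("product") or "unknown"))
    return {ip: sorted(services) for ip, services in exposure.items()}


def search_org(api_key, org_name):
    """Query Shodan's existing index for an organization name.

    Requires `pip install shodan` and a valid key. This talks to Shodan,
    not to the target, so it stays on the passive side of the line.
    """
    import shodan  # imported lazily so summarize_matches works without the package

    api = shodan.Shodan(api_key)
    results = api.search(f'org:"{org_name}"')
    return summarize_matches(results["matches"])
```

The per-IP summary is a convenient input for the scanning phase, since it tells you which hosts and ports are worth confirming first.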
Recon-ng
A full-featured reconnaissance framework built in Python, modeled after Metasploit's workflow. It uses modules to query different data sources — WHOIS, DNS, LinkedIn, breach databases, and more — and stores all results in a local database for structured analysis.
recon-ng
[recon-ng] > marketplace install all
[recon-ng] > workspaces create target_engagement
[recon-ng] > modules load recon/domains-hosts/hackertarget
[recon-ng] > options set SOURCE target.com
[recon-ng] > run
Recon-ng is ideal when you want a repeatable, documented recon workflow, which is especially useful for reporting purposes.
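Because the results live in a local SQLite database, they can be pulled into a report with a few lines of code. This is a rough sketch under two assumptions that you should verify against your own install: that the workspace database sits at ~/.recon-ng/workspaces/target_engagement/data.db, and that it has a hosts table with host and ip_address columns. The function names are hypothetical.

```python
import sqlite3
from pathlib import Path

# Assumed default location; adjust if your install stores workspaces elsewhere.
DEFAULT_DB = Path.home() / ".recon-ng" / "workspaces" / "target_engagement" / "data.db"


def open_workspace(db_path=DEFAULT_DB):
    """Open a workspace database for querying."""
    return sqlite3.connect(str(db_path))


def list_hosts(conn):
    """Return distinct (host, ip_address) rows, sorted by hostname."""
    rows = conn.execute(
        "SELECT DISTINCT host, ip_address FROM hosts "
        "WHERE host IS NOT NULL ORDER BY host"
    )
    return [(host, ip) for host, ip in rows]
```

Typical use would be list_hosts(open_workspace()), with the output written straight into the appendix of the report.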
The MITRE ATT&CK Connection
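Reconnaissance maps directly to the Reconnaissance tactic in the MITRE ATT&CK framework (TA0043). Techniques under this tactic include:
- T1595 Active Scanning
- T1589 Gather Victim Identity Information
- T1590 Gather Victim Network Information
- T1593 Search Open Websites/Domains
- T1596 Search Open Technical Databases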
Mapping your recon activities to ATT&CK is important for writing accurate VAPT reports — it gives clients a standardized reference point for what was done and what risks it relates to.
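One lightweight way to keep that mapping consistent is a lookup table that tags each logged activity with its ATT&CK technique. The technique IDs below are taken from the ATT&CK Reconnaissance tactic; the activity labels and the function name are hypothetical and only illustrate the idea.

```python
# Activity labels on the left are made up for this example.
ATTACK_MAP = {
    "whois_lookup": ("T1596.002", "Search Open Technical Databases: WHOIS"),
    "certificate_transparency": ("T1596.003", "Search Open Technical Databases: Digital Certificates"),
    "shodan_search": ("T1596.005", "Search Open Technical Databases: Scan Databases"),
    "linkedin_search": ("T1593.001", "Search Open Websites/Domains: Social Media"),
    "search_engine_dorking": ("T1593.002", "Search Open Websites/Domains: Search Engines"),
    "email_harvesting": ("T1589.002", "Gather Victim Identity Information: Email Addresses"),
    "nmap_scan": ("T1595.001", "Active Scanning: Scanning IP Blocks"),
}


def tag_activity(activity):
    """Attach tactic and technique metadata to a logged recon activity."""
    technique_id, name = ATTACK_MAP.get(
        activity, ("unmapped", "No ATT&CK mapping recorded")
    )
    return {
        "activity": activity,
        "tactic": "TA0043",
        "technique_id": technique_id,
        "technique": name,
    }
```

Unmapped activities are flagged rather than dropped, so gaps in the mapping show up during report review instead of disappearing silently.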
Recon in the Real World: The Flow
Define Scope
↓
Passive OSINT (Shodan, theHarvester, LinkedIn, WHOIS)
↓
Footprinting (subdomains, email format, IP ranges)
↓
Visual Mapping (Maltego)
↓
Structured Analysis (Recon-ng)
↓
Active Recon (only within approved scope)
↓
Feed findings into Phase 2: Scanning & Enumeration
Key Takeaways
- Reconnaissance is the foundation — weak recon leads to incomplete testing
- Always begin passively; only go active when authorized
- OSINT can reveal a surprising amount about any organization's attack surface
- Tools like theHarvester, Shodan, Maltego, and Recon-ng each serve a distinct purpose — use them together, not in isolation
- Document everything and map it to MITRE ATT&CK for professional reporting
This is Part 1 of a VAPT lifecycle series. The next post covers scanning and enumeration — where we take the map we just built and start probing it.