Your tools are only as sharp as the wordlists feeding them. Here are the 10 essential recon wordlists — what they contain, when to use them, and the exact commands that make them dangerous.

Recon is where engagements are won or lost. A misconfigured endpoint, a forgotten backup file, a leaked API route — these don't show up in Shodan. You find them by systematically throwing the right words at the right places.

But here's the dirty secret of recon: most people use the wrong wordlist for the job.

They fire rockyou.txt at a web directory. They use a 10-line DNS list against a Fortune 500 target. They run the same generic wordlist on every engagement, every time.

This post fixes that. Below are the 10 wordlists every pentester needs to know — not just what they are, but when to reach for each one and how to use them effectively.

1. SecLists — The Master Collection

GitHub: danielmiessler/SecLists

If you only clone one repository in your career, make it this one.

SecLists is not a single wordlist — it's a curated library of thousands of wordlists, organized by use case:

SecLists/
├── Discovery/          # Web content, DNS, subdomains
├── Fuzzing/            # SQLi, XSS, LFI, command injection
├── Passwords/          # Leaked databases, common credentials
├── Usernames/          # Names, handles, AD-format users
└── Web-Shells/         # Known shell filenames

Created and maintained by Daniel Miessler, SecLists is the single most referenced wordlist collection in offensive security. It ships with Kali Linux and is the assumed baseline on OSCP, HackTheBox, and real engagements alike.

Install:

apt install seclists
# or
git clone https://github.com/danielmiessler/SecLists /opt/SecLists

When to use: Always. Start here before reaching for anything else.

2. raft-large-directories.txt — Web Directory Brute Force

Location: SecLists/Discovery/Web-Content/raft-large-directories.txt
Size: ~62,000 entries

Originating from the RAFT web application testing project, this wordlist is built from real-world crawl data: directories that actually exist on live web servers across the internet.

It's the go-to for directory enumeration because it's:

  • Large enough to catch uncommon paths
  • Clean enough to avoid massive false-positive noise
  • Structured from real observation, not theoretical guessing

Command:

ffuf -w /opt/SecLists/Discovery/Web-Content/raft-large-directories.txt \
     -u https://target.com/FUZZ \
     -fc 404,403 \
     -t 50

Pair with raft-large-files.txt for file discovery, and add -e .php,.bak,.old,.txt for backup file hunting.

gobuster dir \
  -u https://target.com \
  -w raft-large-directories.txt \
  -x php,html,bak,old,txt \
  --timeout 10s

When to use: First pass on any web target. Follow up with CMS-specific lists if you identify the tech stack.

3. rockyou.txt — The Password Standard

Location: /usr/share/wordlists/rockyou.txt (Kali) or SecLists/Passwords/Leaked-Databases/
Size: ~14 million passwords

In 2009, a breach of the social gaming site RockYou exposed 32 million plaintext passwords stored in an unsalted database. Security researchers compiled them into a wordlist. It became the most widely used password cracking list in history.

Despite being 15+ years old, rockyou.txt remains devastatingly effective because human password habits haven't changed. The same patterns dominate:

  • First name + year (jessica2003)
  • Common phrases (iloveyou, letmein)
  • Keyboard walks (qwerty123)
  • Season + year (Summer2024!)
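
These patterns are mechanical enough to generate yourself when building a targeted supplement to rockyou.txt. A minimal sketch (filename is illustrative):

```shell
# Generate Season+Year candidates matching the pattern above
for season in Spring Summer Autumn Winter; do
  for year in 2023 2024 2025; do
    printf '%s%s!\n' "$season" "$year"
  done
done > seasons.txt
grep -c '' seasons.txt   # 4 seasons x 3 years = 12 candidates
```

Feed the result to hashcat as a supplemental dictionary alongside rockyou.txt.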

Command (Hashcat):

hashcat -m 1000 -a 0 hashes.txt /usr/share/wordlists/rockyou.txt

With rules (significantly more effective):

hashcat -m 1000 -a 0 hashes.txt rockyou.txt \
  -r /usr/share/hashcat/rules/best64.rule

When to use: NTLM/NTLMv2 hashes from Responder, web login brute force, cracking any leaked password database. Start here before building custom lists.

4. subdomains-top1million-110000.txt — DNS Enumeration

Location: SecLists/Discovery/DNS/subdomains-top1million-110000.txt
Size: ~110,000 entries

Subdomain enumeration is one of the highest-value recon activities in bug bounty and external pentesting. Forgotten development subdomains, legacy APIs, and internal-facing admin panels — they all live under subdomains that aren't linked anywhere.

This list contains the 110,000 most common subdomain prefixes observed in large-scale internet scan data. It covers everything from the obvious (www, mail, api) to the operational gold (dev, staging, internal, uat, vpn).

Command (ffuf, resolving and probing candidates over HTTP):

ffuf -w subdomains-top1million-110000.txt \
     -u https://FUZZ.target.com \
     -mc 200,301,302,403 \
     -t 100

Command (gobuster DNS):

gobuster dns \
  -d target.com \
  -w subdomains-top1million-110000.txt \
  -t 50 \
  --wildcard

When to use: Every external engagement. Run this before starting web app testing to map the full attack surface.

5. n0kovo_subdomains — SSL-Harvested DNS Gold

GitHub: n0kovo/n0kovo_subdomains
Size: ~3 million entries

This is the subdomain list that professionals reach for when they need depth.

Unlike wordlists built from guessing or observation logs, n0kovo_subdomains was created by scanning the entire IPv4 address space and extracting subdomain prefixes from valid TLS/SSL certificates. This means every entry in this list has been proven to exist on a live domain at some point.

The result is a massive, high-signal list that catches real-world subdomains that generic lists miss entirely — internal naming conventions, microservice patterns, regional prefixes.

ffuf -w n0kovo_subdomains.txt \
     -u https://FUZZ.target.com \
     -mc all \
     -fc 400,404 \
     -t 200 \
     -o results.json

When to use: Deep-dive engagements, bug bounty targets with large scopes, or when the top1million list isn't finding enough. Slower due to size — run overnight.

6. api-endpoints.txt — API Route Discovery

Location: SecLists/Discovery/Web-Content/api/api-endpoints.txt
Also: chrislockard/api_wordlist

Modern web applications are built on APIs — and APIs are riddled with unauthenticated endpoints, broken object-level authorization, and hidden debug routes.

Generic directory wordlists miss API-specific patterns like:

  • /v1/, /v2/, /v3/ versioning paths
  • REST resource names (/users, /accounts, /orders)
  • Internal routes (/health, /debug, /metrics, /actuator)
  • GraphQL endpoints (/graphql, /gql, /api/graphql)

The api-endpoints.txt list in SecLists is purpose-built for this. Combine it with burp-parameter-names.txt for parameter fuzzing.
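
The versioned patterns above are also cheap to pre-generate from resource names you have already observed in the app. A sketch, with hypothetical resource names:

```shell
# Prefix observed resource names with version segments (names are illustrative)
printf 'users\naccounts\norders\nhealth\n' > resources.txt
for v in v1 v2 v3; do
  sed "s|^|$v/|" resources.txt
done > api-candidates.txt
grep -c '' api-candidates.txt   # 3 versions x 4 resources = 12 paths
```

Point ffuf at the generated file exactly as with the stock list.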

Command:

ffuf -w api-endpoints.txt \
     -u https://target.com/api/FUZZ \
     -mc 200,201,204,400,401,403 \
     -H "Content-Type: application/json" \
     -t 40

Double fuzzing (path + version; versions.txt here is a short custom list such as v1, v2, v3):

ffuf -w versions.txt:V -w api-endpoints.txt:E \
     -u https://target.com/V/E \
     -mc 200,401,403

When to use: Any modern web app, especially SPAs, mobile backends, and microservice architectures. Don't skip this — unauthenticated API endpoints are some of the most common critical findings.

7. LFI-Jhaddix.txt — Local File Inclusion Payloads

Location: SecLists/Fuzzing/LFI/LFI-Jhaddix.txt

When you've identified a potential file inclusion parameter — a ?page=, ?file=, ?path= — the question becomes: which traversal payloads actually bypass the filters?

Jhaddix's LFI list is the most comprehensive collection of path traversal and local file inclusion payloads in public tooling. It includes:

  • Classic ../../../etc/passwd and variations
  • URL-encoded bypass variants (%2e%2e%2f)
  • Double-encoding (%252e%252e%252f)
  • Null byte injection (../etc/passwd%00)
  • Platform-specific paths (Windows, Linux, macOS)
  • PHP filter chains and wrappers (php://filter/convert...)
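
The encoding tiers are mechanical, so you can derive a missing variant yourself. A sketch for a single payload:

```shell
# Derive URL-encoded and double-encoded variants of one traversal payload
raw='../../../etc/passwd'
enc=$(printf '%s' "$raw" | sed 's|\.|%2e|g; s|/|%2f|g')
dbl=$(printf '%s' "$enc" | sed 's|%|%25|g')
printf '%s\n%s\n%s\n' "$raw" "$enc" "$dbl"
```

Feed the variants into ffuf or Burp Intruder alongside the full list.
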

Command:

ffuf -w LFI-Jhaddix.txt \
     -u "https://target.com/page?file=FUZZ" \
     -mr "root:" \
     -t 30

When to use: Whenever a parameter reads or includes files. Also use for SSRF path testing and path normalization bypasses.

8. CrackStation — Serious Password Cracking

Source: crackstation.net
Size: 1.5 billion passwords (15GB)

When rockyou.txt fails, CrackStation is the next escalation.

This list was compiled from every major public breach and leaked password database — including LinkedIn, Adobe, MySpace, and hundreds of others. At 1.5 billion entries, it covers an enormous percentage of passwords humans have ever actually used.

It's not a daily driver — the size makes it impractical for most operations — but when you're cracking hashes from a high-value target and rockyou.txt comes up empty, CrackStation regularly recovers passwords that nothing else will.
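
One practical way to tame the size is to split the list into line-aligned chunks and crack them in parallel or across rigs. A sketch using a stand-in file (GNU split assumed):

```shell
# Split a huge wordlist into 4 chunks without breaking lines mid-word
seq 1 100 > big.txt              # stand-in for crackstation.txt
split -n l/4 -d big.txt chunk_   # GNU split: writes chunk_00 .. chunk_03
cat chunk_* | grep -c ''         # all 100 lines survive the split
```

Each chunk can then be handed to a separate hashcat instance or machine.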

hashcat -m 1000 -a 0 hashes.txt crackstation.txt \
  --potfile-path crackstation.pot \
  -O --status

When to use: Post-exploitation hash cracking when standard lists fail. Run on a GPU-equipped machine. Not suitable for online brute force due to size.

9. OneListForAll — The Consolidated Mega-List

GitHub: six2dez/OneListForAll

When you need one wordlist to cover everything — directories, files, API paths, backup files, common extensions — OneListForAll is built for exactly that scenario.

Created by six2dez, it merges and deduplicates dozens of the best wordlists from SecLists, fuzzdb, and community sources into one consolidated file. The result performs well across different target types without requiring you to chain multiple lists manually.

git clone https://github.com/six2dez/OneListForAll
ffuf -w OneListForAll/onelistforall.txt \
     -u https://target.com/FUZZ \
     -mc 200,204,301,302,401,403 \
     -t 50 \
     -o output.json

When to use: Time-constrained engagements, CTF challenges, or as a primary list when you don't have time to chain multiple specialized lists.

10. CeWL — Generate Your Own From the Target

GitHub: digininja/CeWL (Custom Word List Generator)

Every wordlist above is generic. CeWL is different — it generates a wordlist specific to your target by crawling the target's website and extracting words from the content.

Why does this matter? Because organizations tend to use their own terminology:

  • Product names as passwords (TurboWidget2024)
  • Internal project codenames as directory names
  • Employee names as usernames and endpoints

CeWL scrapes, extracts, and outputs a custom wordlist that none of the public lists contain.
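
Before cracking, CeWL output usually benefits from simple mangling. A sketch using stand-in words in place of real CeWL output:

```shell
# Append common suffixes to each scraped word (words and suffixes illustrative)
printf 'TurboWidget\nAcmeCloud\n' > scraped.txt
while read -r w; do
  for s in 2024 2025 '!' 123; do printf '%s%s\n' "$w" "$s"; done
done < scraped.txt > mangled.txt
grep -c '' mangled.txt   # 2 words x 4 suffixes = 8 candidates
```

In practice hashcat rules (-r best64.rule) apply this kind of transformation at crack time; the loop just makes it explicit.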

# Generate wordlist from target (min 6 chars, depth 3, with emails)
cewl -d 3 -m 6 --email https://target.com -w custom_target.txt

# Combine with rockyou for password attacks
# (awk preserves rockyou's frequency ordering; sort -u would destroy it)
cat custom_target.txt rockyou.txt | awk '!seen[$0]++' > combined.txt
# Use for directory fuzzing
ffuf -w custom_target.txt \
     -u https://target.com/FUZZ \
     -mc 200,301,302

When to use: Always. Run CeWL on every target as part of recon, then combine the output with standard lists. The findings from custom lists are often the most impactful.

Pro Tips

1. Always deduplicate before running:

sort -u wordlist.txt -o wordlist.txt

2. Combine lists intelligently:

cat raft-large-directories.txt custom_target.txt | sort -u > combined.txt

3. Use appropriate thread counts: Web servers can rate-limit or block aggressive scanning. Start with 50 threads and adjust based on response times.

4. Keep SecLists updated:

cd /opt/SecLists && git pull

5. Extension awareness: Always consider which file extensions are relevant to your target stack. -e php,asp,aspx,jsp,bak,old,txt,xml,json covers most cases.