"The internet forgets nothing. It only hides things badly."

Disclaimer: All techniques in this document are for authorized security research, bug bounty programs, and educational purposes only. Unauthorized use against systems you do not own or have explicit written permission to test is illegal. The author assumes no liability.

Table of Contents

1. The Philosophy of Passive Reconnaissance
2. Operator Logic: The Grammar of Dorking
3. Advanced Operator Chaining
4. Vulnerability Case Studies
   4.1 Exposed Environment & Config Files
   4.2 SQL Dumps & Database Backups
   4.3 Open Directories & Sensitive Log Files
   4.4 IoT Devices & Exposed Dashboards
5. The OSINT Operator's Reference Sheet
6. Mastery Lab: 15 Progressive Exercises
7. Operational Security & Legal Boundaries
8. Further Resources

1. The Philosophy of Passive Reconnaissance

Before a single packet hits a target network, before a scanner spins up, before social engineering enters the picture — there is Google Dorking. It is the art of weaponizing the world's most powerful indexing engine against its own corpus.

Search engines don't just index content. They index mistakes: misconfigured web servers, forgotten backup files, publicly accessible admin panels, credentials left in config files, and directory listings that were never meant to face the internet. Every one of these is a data point. Aggregate enough data points and you have a target profile — all without generating a single log entry on the target's systems.

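This is the core advantage of passive recon: zero footprint. You are not connecting to the target. You are reading what the internet already knows about them. That holds only while you stay on the search results page: opening a result sends a request to the target and lands in its access logs like any other visit.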

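Dorking is built on Google's Advanced Search Operators: boolean and syntactic filters that narrow the search index with surgical precision. The technique is also known as Google Hacking, popularized by Johnny Long's Google Hacking Database (GHDB), which is now hosted by the Exploit-DB project. The same principles apply to Bing and DuckDuckGo, whose operator sets overlap with Google's but are not identical, and carry over in spirit to device search engines such as Shodan, Censys, and ZoomEye, which use their own filter syntax.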

This document teaches you the complete grammar of that language.

2. Operator Logic: The Grammar of Dorking

Think of Google's index as a relational database. Every web page is a row. Search operators are WHERE clauses. Combine them correctly and you SELECT only the rows that matter.
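The analogy can be made concrete with a toy model. The sketch below is purely illustrative: the `Page` record, the three sample rows, and the predicate helpers are assumptions made for this example, and the matching rules are far simpler than what Google actually does.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Page:
    """One 'row' in a toy index: a URL, a title, and body text."""
    url: str
    title: str
    body: str


def site(domain):
    """site:domain -> host equals the domain or is a subdomain of it."""
    def predicate(page):
        host = urlparse(page.url).hostname or ""
        return host == domain or host.endswith("." + domain)
    return predicate


def intitle(term):
    """intitle:term -> case-insensitive match against the title."""
    return lambda page: term.lower() in page.title.lower()


def filetype(ext):
    """filetype:ext -> URL path ends with .ext."""
    return lambda page: urlparse(page.url).path.lower().endswith("." + ext.lower())


def select(index, *predicates):
    """Implicit AND: keep only rows that satisfy every predicate."""
    return [page for page in index if all(p(page) for p in predicates)]


# Hypothetical rows, invented for the example.
INDEX = [
    Page("https://www.example.com/docs/guide.pdf", "User Guide", "Welcome"),
    Page("https://staging.example.com/", "Index of /", "Parent Directory"),
    Page("https://other.test/report.pdf", "Annual Report", "Figures"),
]

pdfs = select(INDEX, site("example.com"), filetype("pdf"))
print([page.url for page in pdfs])
# ['https://www.example.com/docs/guide.pdf']
```

Each helper plays the role of one WHERE clause, and passing several of them to `select` mirrors the implicit AND that a space has in a real query.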

2.1 Core Operator Reference

`site:`

Function: Restricts results to a specific domain or subdomain.
Syntax: site:example.com
Logic: WHERE domain = 'example.com'

site:tesla.com filetype:pdf
site:*.gov confidential
site:staging.example.com

Inversion: -site:www.example.com excludes the primary domain, forcing Google to surface subdomains and alternative properties.

site:example.com -site:www.example.com

`intitle:`

Function: Matches search terms against the HTML <title> tag of a page.
Syntax: intitle:"keyword"
Logic: WHERE page_title LIKE '%keyword%'

Multi-word variant: allintitle:word1 word2 requires both words in title.

intitle:"index of"
intitle:"admin panel" site:example.com
allintitle:login portal admin

Why it matters: Page titles are rarely sanitized by developers. Admin panels, error pages, and directory listings broadcast their nature in the title.

`inurl:`

Function: Matches search terms against the URL, including the host, path, and query string.
Syntax: inurl:"/admin"
Logic: WHERE url CONTAINS '/admin'

Multi-word variant: allinurl:word1 word2

inurl:"/wp-admin/admin-ajax.php"
inurl:"/phpinfo.php" site:example.com
inurl:"?id=" inurl:".php"

Why it matters: URL structures reveal application frameworks, parameter names, and hidden endpoints. A URL containing /api/v1/internal/ tells you more than any documentation.

`filetype:` / `ext:`

Function: Restricts results to a specific file extension. filetype: and ext: are functionally identical on Google.
Syntax: filetype:pdf or ext:sql
Logic: WHERE file_extension = 'pdf'

filetype:env site:example.com
ext:sql "INSERT INTO"
filetype:log inurl:"/var/log"
ext:bak site:example.com

Why it matters: This is the single most powerful operator for finding exposed sensitive files. Developers deploy backup files (.bak, .old, .orig), leave configuration files (.env, .config, .yaml) accessible, and forget SQL dumps in web-accessible directories.

`intext:`

Function: Searches the full body text (content) of indexed pages.
Syntax: intext:"password"
Logic: WHERE page_body CONTAINS 'password'

intext:"DB_PASSWORD" filetype:env
intext:"-----BEGIN RSA PRIVATE KEY-----"
intext:"mysql_connect" filetype:php

`cache:`

Function: Returns Google's cached (archived) snapshot of a page.
Syntax: cache:example.com/page
Use case: Accessing content that has since been removed or taken offline. Useful for recovering deleted data during an investigation.

cache:example.com/admin/config

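Note: Google retired its public cache during 2024, so the cache: operator and the direct URL form (https://webcache.googleusercontent.com/search?q=cache:example.com) should no longer be expected to return snapshots. For removed or changed pages, use the Internet Archive's Wayback Machine (https://web.archive.org/) instead.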

`before:` and `after:`

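Function: Filters results by date, using Google's estimate of when a page was published or last updated rather than the moment it was indexed.
Syntax: before:YYYY-MM-DD / after:YYYY-MM-DD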

site:example.com after:2023-01-01 filetype:pdf
intext:"password" filetype:env after:2022-06-01

Boolean Modifiers

| Modifier | Symbol | Behavior |
|----------|--------|----------|
| AND | (implicit / space) | Both terms must appear |
| OR | OR or \| | Either term may appear |
| NOT | - (minus prefix) | Exclude this term |
| Exact match | "..." (quotes) | Literal string match |
| Wildcard | * (asterisk) | Placeholder for one or more whole words |

"admin" OR "administrator" inurl:login site:example.com
intitle:"index of" "parent directory" -tutorial   # Require a listing marker, exclude tutorial pages
"password" * "username" filetype:txt

2.2 Operator Scope Rules

Understanding scope prevents false positives and wasted time:

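Unquoted values are matched as whole terms rather than raw substrings: inurl:admin reliably matches URLs where admin is a distinct token (/admin, /admin/login), while matches on longer words such as /administrator or /adminpanel are not guaranteed
Quoted values keep the words together and in order: inurl:"/admin/" narrows results toward that path segment, but Google ignores most punctuation, so treat it as a strong hint rather than an exact substring match
Operators combine with AND logic by default, whether they are of the same type or different types: intitle:login inurl:admin requires BOTH conditions
Minus operator must precede the term with no space: -site:example.com (correct) vs - site:example.com (wrong)
site: is the scope anchor, so always place it first or last for readability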
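Some of these rules are mechanical enough to check before a query is run. The following is a minimal sketch of such a check; `lint_query`, its operator list, and its warning messages are hypothetical helpers written for this example, and they cover only the spacing and quoting mistakes described above.

```python
import re

# Operators covered in this section; illustrative, not exhaustive.
KNOWN_OPERATORS = ("site", "intitle", "allintitle", "inurl", "allinurl",
                   "filetype", "ext", "intext", "before", "after")


def lint_query(query):
    """Return a list of warnings for common syntax slips in a dork string."""
    warnings = []

    # "- site:example.com": the minus is detached, so nothing is excluded.
    if re.search(r"(?:^|\s)-\s+\w+:", query):
        warnings.append("space after '-': the exclusion will not apply")

    # "site: example.com": a space after the colon detaches the value.
    for op in KNOWN_OPERATORS:
        if re.search(rf"(?:^|[\s-]){op}:\s", query):
            warnings.append(f"space after '{op}:'")

    # An odd number of double quotes means a phrase was never closed.
    if query.count('"') % 2:
        warnings.append("unbalanced double quote")

    return warnings


print(lint_query("site:example.com -site:www.example.com"))
# []
print(lint_query("site:example.com - site:www.example.com"))
# ["space after '-': the exclusion will not apply"]
```

A linter like this only catches syntax slips. It cannot tell whether a query is well scoped, which remains a matter of judgment.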

3. Advanced Operator Chaining

Individual operators are picks. Chained operators are crowbars.

3.1 The Chaining Formula

[SCOPE] + [LOCATION FILTER] + [CONTENT FILTER] + [FILE TYPE] + [NEGATIVE EXCLUSIONS]

Exemplar chain:

site:example.com inurl:"/backup" ext:sql -"access denied"

Decomposed:

site:example.com → Target scope
inurl:"/backup" → Location filter (URL must contain /backup)
ext:sql → File type filter
-"access denied" → Exclude pages that are protected (reduces noise)
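The formula maps naturally onto a small string builder. This is a sketch under stated assumptions: `build_dork` and its parameter names are invented for illustration, it always emits a `site:` scope so every query stays tied to one domain, and it covers only the operators used in the exemplar chain.

```python
def build_dork(scope, location=None, content=None, file_ext=None, exclude=()):
    """Assemble a query as SCOPE, LOCATION, CONTENT, FILE TYPE, EXCLUSIONS."""
    def quote(value):
        # Quote every value so multi-word phrases stay together.
        return '"' + value.replace('"', "") + '"'

    parts = [f"site:{scope}"]                      # [SCOPE]
    if location:
        parts.append(f"inurl:{quote(location)}")   # [LOCATION FILTER]
    if content:
        parts.append(f"intext:{quote(content)}")   # [CONTENT FILTER]
    if file_ext:
        parts.append(f"ext:{file_ext.lstrip('.')}")  # [FILE TYPE]
    parts.extend(f"-{quote(term)}" for term in exclude)  # [NEGATIVE EXCLUSIONS]
    return " ".join(parts)


print(build_dork("example.com", location="/backup", file_ext="sql",
                 exclude=["access denied"]))
# site:example.com inurl:"/backup" ext:sql -"access denied"
```

Making the scope a required argument is a deliberate choice in this sketch: it keeps generated queries pointed at a domain you are authorized to assess.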

3.2 High-Yield Chain Patterns

Pattern: Subdomain Enumeration via `site:` inversion

site:*.example.com -site:www.example.com -site:mail.example.com

Google indexes subdomains. This surfaces staging, dev, api, and internal subdomains that certificate transparency logs may miss.

Pattern: Parameter Discovery for Injection Testing

site:example.com inurl:"?" inurl:".php" -inurl:".js"

Surfaces PHP pages with GET parameters — primary attack surface for SQLi and LFI.

Pattern: Exposed Panels & Interfaces

intitle:"dashboard" inurl:"/admin" site:example.com
intitle:"phpMyAdmin" inurl:"/phpmyadmin" -demo -tutorial
intitle:"Kibana" inurl:":5601"

Pattern: Version Disclosure for CVE Matching

intitle:"Apache/2.4.49" "server at"
intext:"Powered by WordPress 5.8" site:example.com
intitle:"IIS Windows Server" intext:"IIS/10.0"

Extract version strings, cross-reference with NVD or Exploit-DB.

Pattern: Cloud & Infrastructure Enumeration

site:s3.amazonaws.com "example"
site:blob.core.windows.net "example"
site:storage.googleapis.com "example"

Discovers public cloud storage buckets associated with a target — a goldmine for misconfiguration findings.

Pattern: API Key & Token Exposure in GitHub (via Google)

site:github.com "example.com" "api_key" OR "secret_key" OR "access_token"
site:pastebin.com "example.com" password

Pattern: Error Message Exploitation

Error messages are intelligence leakage. They reveal stack traces, internal paths, database schemas, and framework versions.

site:example.com intext:"Fatal error" intext:"on line"
site:example.com intext:"mysqli_connect()" intext:"failed"
site:example.com intext:"Warning: include(" intext:"failed to open stream"

Pattern: The "Dork of Dorks" — Credential Search

filetype:txt intext:"username" intext:"password" intext:"email" site:example.com
filetype:csv intext:"password" site:example.com
intext:"login:" intext:"password:" filetype:log

4. Vulnerability Case Studies

4.1 Exposed Environment & Config Files

Environment and configuration files (.env, .config, .ini, .yaml) are among the most common critical misconfigurations found via dorking. They contain database credentials, API keys, JWT secrets, OAuth tokens, and SMTP credentials, often for production systems.

Dork Strings

# Primary .env discovery
filetype:env "DB_PASSWORD"
filetype:env "APP_KEY"
filetype:env intext:"DB_HOST" intext:"DB_USER"
ext:env "SECRET_KEY"
# Laravel / Symfony / Node
filetype:env "APP_ENV=production"
filetype:env intext:"MAIL_PASSWORD"
filetype:env intext:"STRIPE_SECRET"
filetype:env intext:"AWS_SECRET_ACCESS_KEY"
# Configuration files across frameworks
ext:config intext:"connectionString"
ext:xml intext:"<password>" intext:"<username>"
ext:yml intext:"password:" intext:"host:"
ext:yaml intext:"db_pass:" -site:github.com
ext:ini intext:"password" intext:"user"
# Exposed application config backups
inurl:"config.php.bak"
inurl:"wp-config.php.bak" OR inurl:"wp-config.php~"
filetype:bak inurl:"config"
ext:old inurl:"settings"

What You're Looking For

# Typical .env content (DO NOT use against real systems)
DB_HOST=prod-db.example.internal
DB_DATABASE=customers_production
DB_USERNAME=root
DB_PASSWORD=Sup3rS3cur3!
AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
STRIPE_SECRET=sk_live_XXXXXXXXXXXXXXXXXXXX
JWT_SECRET=my-super-secret-jwt-key

Why This Happens

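Developers use .env files locally and then deploy them to production inside the web server's document root. Keeping the file out of version control with .gitignore does nothing to stop the web server from serving it, and listing it in robots.txt only advertises its location to anyone who reads that file. Unless Apache or Nginx is configured to deny requests for dotfiles and backup extensions, the file is reachable over plain HTTP.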
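On the defensive side, the same file patterns can be checked locally before a search engine ever sees them. The sketch below walks a document root on your own server and lists files whose names resemble the dork targets in this section. The function names and the extension list are assumptions chosen for illustration, not an exhaustive policy.

```python
import os

# Extensions taken from the dork strings in this section; extend as needed.
SENSITIVE_EXTENSIONS = {".env", ".bak", ".old", ".sql", ".ini", ".yml", ".yaml"}


def looks_sensitive(filename):
    """Heuristic: does this file name suggest secrets or a backup copy?"""
    name = filename.lower()
    # ".env" is a dotfile with no extension, so test the name directly.
    if name == ".env" or name.startswith(".env."):
        return True
    # Editor backup copies such as "wp-config.php~".
    if name.endswith("~"):
        return True
    return os.path.splitext(name)[1] in SENSITIVE_EXTENSIONS


def find_exposed_files(web_root):
    """Return sorted paths (relative to web_root) of suspicious files."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(web_root):
        for filename in filenames:
            if looks_sensitive(filename):
                full_path = os.path.join(dirpath, filename)
                relative = os.path.relpath(full_path, web_root)
                hits.append(relative.replace(os.sep, "/"))
    return sorted(hits)
```

Calling `find_exposed_files("/var/www/html")` on a server you administer returns candidate paths to move out of the document root or block at the web server level. The path is only an example.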

4.2 SQL Dumps & Database Backups

Database backups are time-bombs. DBAs and developers routinely dump databases for migration, testing, or backup purposes and place them in web-accessible directories "temporarily."

Dork Strings

# SQL dumps
ext:sql "INSERT INTO" "VALUES"
ext:sql intext:"CREATE TABLE"
ext:sql intext:"DROP TABLE IF EXISTS"
filetype:sql intext:"-- phpMyAdmin SQL Dump"
filetype:sql intext:"Dumping data for table"
# Compressed backups (often contain full DB dumps)
ext:gz inurl:"backup" intext:"sql"
ext:zip inurl:"backup" intext:"database"
ext:tar.gz inurl:"db_backup"
ext:7z inurl:"database" site:example.com
# MySQL-specific
filetype:sql intext:"mysql_connect"
inurl:"mysqldump" ext:sql
filetype:sql intext:"ENGINE=InnoDB"
# Specific backup file naming conventions
inurl:"backup_db" ext:sql
inurl:"db_dump" ext:sql
inurl:"database.sql"
inurl:"prod_backup" ext:sql

Why This Happens

mysqldump writes to standard output, and by convention that output is redirected to a .sql file. Developers often place the file at the web root (/var/www/html/backup.sql) for easy SCP or SFTP retrieval, then forget it. Apache's default configuration serves any readable file in the web root.

Impact Assessment

A single SQL dump can contain:

Full user table with bcrypt/MD5 hashes (crackable offline)
PII: email addresses, phone numbers, physical addresses
Payment metadata (if GDPR/PCI compliance was not enforced at schema level)
Internal business logic exposed via stored procedures

4.3 Open Directories & Sensitive Log Files

An "open directory" (or "directory listing") occurs when a web server is configured to render a browsable list of files in a directory when no index.html/index.php exists. This is the digital equivalent of leaving your filing cabinet open.

Dork Strings — Open Directories

# Classic open directory signatures
intitle:"index of"
intitle:"Index of /" "parent directory"
intitle:"index of" "last modified"
# Targeted directory types
intitle:"index of" "backup"
intitle:"index of" "/etc"
intitle:"index of" "logs"
intitle:"index of" ".ssh"
intitle:"index of" "private"
intitle:"index of" "passwords"
intitle:"index of" "/uploads"
# Combined with site scope
intitle:"index of" site:example.com
intitle:"index of" ext:sql site:example.com

Dork Strings — Log Files

Log files are a treasure chest for OSINT and red team operators. They contain usernames, internal IP addresses, error stack traces, API endpoints, and sometimes plaintext passwords (from failed login attempts that included the password in the URL).

# General log file exposure
ext:log inurl:"/logs/"
ext:log intext:"error" intext:"warning"
filetype:log "password"
filetype:log intext:"username" intext:"failed"
# Application-specific logs
ext:log intext:"PHP Fatal error"
ext:log inurl:"access_log"
ext:log inurl:"error_log"
ext:log intext:"Apache" intext:"GET /"
# Authentication logs (extremely sensitive)
filetype:log intext:"authentication failure"
filetype:log intext:"Invalid user" intext:"ssh"
filetype:log intext:"Failed password for"
# Application framework logs
inurl:"/storage/logs/laravel.log"
inurl:"/application/logs/"
filetype:log intext:"Stack trace:"

Dork Strings — SSH & Cryptographic Key Material

ext:pem intext:"BEGIN RSA PRIVATE KEY"
ext:key intext:"PRIVATE KEY"
filetype:ppk intext:"PuTTY-User-Key-File"
intext:"-----BEGIN OPENSSH PRIVATE KEY-----" filetype:txt
intext:"-----BEGIN EC PRIVATE KEY-----"

4.4 IoT Devices & Exposed Dashboards

The public internet is full of industrial control systems, IP cameras, network appliances, and monitoring dashboards that were deployed with default credentials or no authentication at all, and many of them are indexed by Google and Shodan.

Dork Strings — Network Devices

# Router & switch admin panels
intitle:"RouterOS" inurl:"/winbox"
intitle:"NETGEAR" inurl:"/setup.cgi"
intitle:"Cisco IOS" inurl:"/level/15/exec/-/show"
intitle:"DD-WRT" inurl:"/Management.asp"
intitle:"pfSense" inurl:"/index.php" -demo
# Printers (often expose network configs and recent print jobs)
intitle:"Printer Status" inurl:"/printer/main.html"
intitle:"HP LaserJet" inurl:"/hp/device/info_deviceStatus.xml"
intitle:"RICOH" inurl:"/web/guest/en/websys/"

Dork Strings — IP Cameras & Video Feeds

intitle:"Live View / - AXIS"
intitle:"webcamXP" inurl:":8080"
inurl:"/axis-cgi/mjpg/video.cgi"
intitle:"Network Camera" inurl:"/view/index.shtml"
intitle:"IP Camera" inurl:"/view/viewer_index.shtml"
inurl:"/CgiStart?page=Single"

Dork Strings — Industrial Control Systems (ICS/SCADA)

intitle:"SCADA" inurl:"/main.html"
intitle:"Modbus" inurl:":502"
intitle:"Schneider Electric" inurl:"/index.html"
intext:"PLCopen" inurl:"/webvisu.htm"
intitle:"Siemens" inurl:"/portal/portal.mwsl"

⚠️ Critical Warning: ICS/SCADA systems control physical infrastructure. Unauthorized access is a criminal offense in most jurisdictions and can cause physical harm. Research access ONLY in authorized lab environments.

Dork Strings — Monitoring & Analytics Dashboards

# Grafana (unauthenticated instances)
intitle:"Grafana" inurl:":3000"
inurl:"/d/" intitle:"Grafana"
# Kibana (exposed Elasticsearch data)
intitle:"Kibana" inurl:":5601/app/kibana"
# Prometheus
intitle:"Prometheus" inurl:":9090/graph"
# Jupyter Notebooks (often contain API keys in cells)
intitle:"Jupyter Notebook" inurl:"/tree"
# Exposed Jenkins CI/CD
intitle:"Dashboard [Jenkins]"
intitle:"Jenkins" inurl:"/manage"

Dork Strings — Database Admin Interfaces

intitle:"phpMyAdmin" inurl:"/phpmyadmin"
intitle:"phpMyAdmin" "running on" inurl:"main.php"
intitle:"Adminer" inurl:"/adminer.php"
intitle:"MongoDB" inurl:":27017"
intitle:"Redis" inurl:":6379"
inurl:"/solr/admin"
intitle:"Elasticsearch" inurl:":9200/_cat/indices"

5. The OSINT Operator's Reference Sheet

A consolidated cheat sheet for field use:

┌─────────────────────────────────────────────────────────────────────┐
│                    DORKING QUICK REFERENCE                          │
├──────────────────┬──────────────────────────────────────────────────┤
│ OPERATOR         │ USE CASE                                         │
├──────────────────┼──────────────────────────────────────────────────┤
│ site:            │ Scope to domain / enumerate subdomains           │
│ intitle:         │ Page title match → admin panels, dir listings    │
│ inurl:           │ URL path match → endpoints, params, file paths   │
│ filetype:/ext:   │ File extension → env, sql, log, bak, pem, yml   │
│ intext:          │ Body content match → credentials, error msgs     │
│ cache:           │ View archived/deleted page content               │
│ before:/after:   │ Time-bound searches for historical data          │
│ -                │ Exclude term → reduce noise, filter false pos    │
│ " "              │ Exact phrase match → reduce false positives      │
│ *                │ Wildcard → flexible phrase matching              │
│ OR / |           │ Multi-term matching → broaden results            │
└──────────────────┴──────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────┐
│                 TOP SENSITIVE EXTENSIONS                            │
├───────────┬──────────────────────────────────────────────────────┤
│ .env      │ App environment — DB creds, API keys, secrets          │
│ .sql      │ Database dumps — schemas, user hashes, data            │
│ .log      │ App/server logs — usernames, IPs, errors, passwords    │
│ .bak/.old │ Backup files — source code, configs, DB dumps          │
│ .pem/.key │ Cryptographic keys — SSH, SSL, code signing            │
│ .yaml/.yml│ Infrastructure configs — K8s, Docker, Ansible          │
│ .config   │ Application configs — often contain credentials        │
│ .ppk      │ PuTTY private keys                                      │
│ .htpasswd │ Apache password files                                   │
│ .csv/.xls │ Data exports — PII, customer records, credentials      │
└───────────┴──────────────────────────────────────────────────────┘

6. Mastery Lab: 15 Progressive Exercises

The following tasks are progressive. Each builds on the last. All tasks should be performed on domains you own, in authorized bug bounty scopes, or against legal OSINT practice targets (e.g., scanme.nmap.org, hackthebox.eu, tryhackme.com, vulnhub.com, HackerOne/Bugcrowd public programs).

Tier 1 — Fundamentals (Tasks 1–4)

Task 1: Basic Site Enumeration

Objective: Map the visible surface of a target domain. Difficulty: ★☆☆☆☆

site:example.com

Expected output: All indexed pages under the domain. Next steps: Count results. Check for subdomains, unusual pages, admin paths in URLs.

Variations to try:

site:example.com -site:www.example.com    # Subdomain enumeration
site:example.com filetype:pdf             # Document enumeration
site:example.com inurl:login              # Authentication surface

Task 2: File Type Mapping

Objective: Enumerate all file types publicly indexed for a domain. Difficulty: ★☆☆☆☆

Run this sequence for a target domain:

site:example.com filetype:pdf
site:example.com filetype:doc OR filetype:docx
site:example.com filetype:xls OR filetype:xlsx
site:example.com filetype:txt
site:example.com filetype:xml

Analysis prompt: What can the document metadata (author, creation date, software version) reveal about internal tooling? Use ExifTool on downloaded files.

Task 3: Title-Based Panel Discovery

Objective: Identify exposed admin and management interfaces. Difficulty: ★★☆☆☆

intitle:"admin" OR intitle:"dashboard" OR intitle:"control panel" site:example.com

Bonus: Cross-reference found panels against searchsploit for known CVEs matching the software version disclosed in the page title.

Task 4: Open Directory Hunting

Objective: Find directory listings on a target domain. Difficulty: ★★☆☆☆

intitle:"index of" site:example.com
intitle:"index of" "parent directory" site:example.com

Document: List every exposed directory. Note file types present. Flag any directories containing config, backup, log, or credential files for escalation.

Tier 2 — Intermediate (Tasks 5–9)

Task 5: Credential File Discovery

Objective: Locate exposed .env and config files. Difficulty: ★★★☆☆

site:example.com filetype:env
site:example.com ext:config
site:example.com inurl:".env" -inurl:".env."
site:example.com intext:"DB_PASSWORD"

Report format:

URL: [discovered URL]
Severity: Critical
Contents: [types of credentials found]
Remediation: Remove file from web root; rotate all exposed credentials immediately
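If findings are collected programmatically, the report format above can be produced from a small record type. This is only a sketch: the `Finding` class and its field names are assumptions made for the example, and the sample URL is hypothetical. The `contents` field is meant to hold the types of credentials seen, not the secret values themselves.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    """One reportable exposure, mirroring the four-line report format."""
    url: str
    severity: str
    contents: list = field(default_factory=list)  # credential TYPES only
    remediation: str = ("Remove file from web root; "
                        "rotate all exposed credentials immediately")

    def render(self):
        """Return the finding as the plain-text block used in reports."""
        return "\n".join([
            f"URL: {self.url}",
            f"Severity: {self.severity}",
            f"Contents: {', '.join(self.contents) or 'unspecified'}",
            f"Remediation: {self.remediation}",
        ])


finding = Finding(
    url="https://staging.example.com/.env",  # hypothetical example URL
    severity="Critical",
    contents=["database credentials", "API keys"],
)
print(finding.render())
```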

Task 6: Log File Exposure Mapping

Objective: Surface application and server log files. Difficulty: ★★★☆☆

site:example.com ext:log
site:example.com inurl:"/logs/" -inurl:"robots.txt"
site:example.com filetype:log intext:"error"

OSINT extraction targets within logs:

Internal IP address ranges (RFC 1918 vs. public)
Usernames from authentication failures
API endpoint paths not visible in public documentation
Software/framework version strings from stack traces
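The first extraction target, separating RFC 1918 addresses from public ones, can be sketched with the standard library. The function name and the sample log lines are invented for illustration. The check uses the three RFC 1918 networks explicitly rather than `is_private`, which also covers other reserved ranges.

```python
import ipaddress
import re

# The three private ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

# Loose candidate match; ip_address() below rejects invalid octets.
IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def classify_addresses(log_text):
    """Split IPv4 addresses found in log_text into (internal, public) lists."""
    internal, public = set(), set()
    for candidate in IPV4_CANDIDATE.findall(log_text):
        try:
            address = ipaddress.ip_address(candidate)
        except ValueError:
            continue  # not a valid address, e.g. 999.1.1.1
        if any(address in network for network in RFC1918_NETWORKS):
            internal.add(str(address))
        else:
            public.add(str(address))
    return sorted(internal), sorted(public)


# Hypothetical log excerpt for demonstration.
SAMPLE_LOG = """\
Failed password for admin from 10.0.3.7 port 51122 ssh2
GET /api/v1/status from 192.168.1.20 via 8.8.8.8
upstream 172.32.0.1 timed out; ignoring malformed peer 999.1.1.1
"""

print(classify_addresses(SAMPLE_LOG))
# (['10.0.3.7', '192.168.1.20'], ['172.32.0.1', '8.8.8.8'])
```

Note that 172.32.0.1 is classified as public: the private 172.16.0.0/12 block ends at 172.31.255.255.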

Task 7: SQL Dump & Backup Discovery

Objective: Identify exposed database artifacts. Difficulty: ★★★☆☆

site:example.com ext:sql
site:example.com inurl:"backup" ext:zip OR ext:gz OR ext:tar
site:example.com inurl:"db_backup"
site:example.com filetype:sql intext:"INSERT INTO"

Escalation path: If a .sql dump is found, analyze the schema for user tables. Extract password hashes for offline cracking with Hashcat: hashcat -m 3200 hashes.txt rockyou.txt (bcrypt mode).

Task 8: Cloud Storage Bucket Enumeration

Objective: Find publicly accessible cloud storage linked to a target. Difficulty: ★★★☆☆

site:s3.amazonaws.com "targetname"
site:blob.core.windows.net "targetname"
site:storage.googleapis.com "targetname"
inurl:"targetname.s3.amazonaws.com"

Verification: Once a bucket is found, attempt enumeration with aws s3 ls s3://bucketname --no-sign-request. A successful listing indicates a public bucket. Document ALL exposed files before reporting.

Task 9: Version Fingerprinting for CVE Identification

Objective: Extract software version information for vulnerability mapping. Difficulty: ★★★☆☆

site:example.com intext:"Powered by WordPress"
site:example.com intext:"jQuery v" intext:"Copyright"
intitle:"Apache" intext:"Server at" intext:"Port 80" site:example.com
site:example.com intext:"PHP/" -intext:"documentation"

Workflow:

Extract version string from dork result
Query NVD: https://nvd.nist.gov/vuln/search?query=wordpress+5.8
Cross-reference: searchsploit wordpress 5.8
Correlate with available exploits in Metasploit: search type:exploit name:wordpress
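
The first two workflow steps can be sketched in Python as follows. This is a rough illustration under stated assumptions: `extract_version` and `nvd_query_url` are hypothetical helper names, the regex only recognizes dotted versions such as 5.8 or 3.6.0, and the NVD URL form is the one quoted in the workflow above.

```python
import re
from typing import Optional
from urllib.parse import urlencode

# URL form taken from the workflow above.
NVD_SEARCH = "https://nvd.nist.gov/vuln/search"


def extract_version(snippet: str, product: str) -> Optional[str]:
    """Return the first dotted version number that follows the product name."""
    # Allow up to 10 non-digit characters (e.g. " v") between name and version.
    pattern = re.escape(product) + r"\D{0,10}?(\d+(?:\.\d+)+)"
    match = re.search(pattern, snippet, flags=re.IGNORECASE)
    return match.group(1) if match else None


def nvd_query_url(product: str, version: str) -> str:
    """Build the NVD keyword-search URL for a product and version."""
    return NVD_SEARCH + "?" + urlencode({"query": f"{product} {version}"})


# Hypothetical search-result snippet.
version = extract_version("Powered by WordPress 5.8 | Theme", "wordpress")
if version:
    print(nvd_query_url("wordpress", version))
```

A version string found in a search snippet reflects the page as it was last crawled, so it is worth confirming against the live page before mapping CVEs to it.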

Tier 3 — Advanced (Tasks 10–13)

Task 10: IoT Device Surface Mapping

Objective: Enumerate internet-exposed devices associated with a target organization. Difficulty: ★★★★☆

Use a combination of Google and Shodan for maximum coverage.

Google:

intitle:"Network Camera" site:example.com
intitle:"SCADA" OR intitle:"HMI" site:example.com
intitle:"phpMyAdmin" site:example.com

Shodan (supplement):

org:"Target Organization Name" port:22,23,80,443,8080,8443,9200,6379,5432,3306,27017

Deliverable: A device inventory table:

| IP/URL | Device Type | Firmware | Auth? | CVE |
|--------|-------------|----------|-------|-----|
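
One way to keep the inventory consistent is to hold each device as a record and render the table from it. The sketch below assumes hypothetical names (`Device`, `render_inventory`) and uses a documentation-range address as sample data; it only formats findings that were already collected.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Device:
    address: str         # IP or URL
    device_type: str
    firmware: str
    auth_required: bool
    cve: str = ""        # left blank until a CVE is confirmed


def render_inventory(devices: List[Device]) -> str:
    """Render devices as a pipe table using the column layout shown above."""
    lines = [
        "| IP/URL | Device Type | Firmware | Auth? | CVE |",
        "|--------|-------------|----------|-------|-----|",
    ]
    for d in devices:
        auth = "Yes" if d.auth_required else "No"
        lines.append(
            f"| {d.address} | {d.device_type} | {d.firmware} | {auth} | {d.cve} |"
        )
    return "\n".join(lines)


# Sample entry using a documentation-only address (192.0.2.0/24).
print(render_inventory([Device("192.0.2.10", "Network Camera", "unknown", False)]))
```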

Task 11: Multi-Platform Credential Harvesting

Objective: Search paste sites, code repositories, and document hosts for credential exposure. Difficulty: ★★★★☆

site:pastebin.com "example.com" password
site:github.com "example.com" "api_key" OR "secret"
site:gitlab.com "example.com" filetype:env
site:trello.com "example.com" password
site:docs.google.com "example.com" "password"

Automation reference: Tools like GitDorker, trufflehog, and gitleaks automate credential scanning across GitHub. The manual dork approach surfaces what automation misses — plain text pastes and document embeds.

Task 12: Error Message Intelligence Extraction

Objective: Map internal architecture from exposed error messages. Difficulty: ★★★★☆

site:example.com intext:"Fatal error:" intext:"on line"
site:example.com intext:"SQLSTATE[" -intext:"documentation"
site:example.com intext:"ORA-01" -intext:"tutorial"
site:example.com intext:"Exception in thread" intext:"at java."
site:example.com intext:"Traceback (most recent call last)"

Intelligence extraction table:

| Error Type | What It Reveals |
|------------|-----------------|
| PHP Fatal error | Framework version, file system path, function names |
| SQLSTATE | DB type (MySQL/MSSQL/Oracle), query structure |
| Java exception | Package structure, internal class names, server type |
| Python traceback | Python version, library versions, file paths |
| Ruby on Rails | MVC structure, gem names, environment (dev/prod) |
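
When many error pages come back, a small classifier helps sort them into the categories above before manual review. The sketch below is an assumption-laden starting point: `classify_error` and `ERROR_SIGNATURES` are hypothetical names, and the patterns simply mirror the dork strings from this task rather than covering every error format.

```python
import re
from typing import Optional

# Signature patterns mirror the dork strings used in this task.
ERROR_SIGNATURES = [
    ("PHP Fatal error", re.compile(r"Fatal error:.*on line \d+", re.DOTALL)),
    ("SQLSTATE", re.compile(r"SQLSTATE\[\w+\]")),
    ("Oracle error", re.compile(r"ORA-\d{5}")),
    ("Java exception", re.compile(r"Exception in thread|^\s*at java\.", re.MULTILINE)),
    ("Python traceback", re.compile(r"Traceback \(most recent call last\)")),
]


def classify_error(text: str) -> Optional[str]:
    """Return the label of the first matching signature, or None."""
    for label, pattern in ERROR_SIGNATURES:
        if pattern.search(text):
            return label
    return None


# Hypothetical error text for illustration.
print(classify_error("Fatal error: Uncaught Error in /var/www/html/index.php on line 12"))
```

The first matching signature wins, so a page that embeds more than one error type is reported only under the earliest entry in the list.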

Task 13: Exposed Monitoring Infrastructure

Objective: Locate unauthenticated monitoring, logging, and CI/CD systems. Difficulty: ★★★★☆

intitle:"Grafana" inurl:":3000" site:*.example.com
intitle:"Kibana" inurl:":5601" site:*.example.com
intitle:"Jenkins" inurl:"/manage" site:*.example.com
intitle:"Prometheus" inurl:":9090" site:*.example.com
inurl:"/actuator" site:example.com                    # Spring Boot actuator
inurl:"/_cat/indices" site:example.com                # Elasticsearch
intitle:"Jupyter Notebook" site:*.example.com

Why this is critical: Grafana with anonymous access exposes business metrics and internal service names. Jenkins with no auth allows arbitrary code execution on build agents. Kibana exposes raw log data that may contain credentials.

Tier 4 — Expert (Tasks 14–15)

Task 14: Full Subdomain & Infrastructure Profile

Objective: Build a complete infrastructure map of a target organization using passive techniques only. Difficulty: ★★★★★

This is a multi-phase exercise combining several techniques:

Phase 1: Subdomain Harvesting

site:*.example.com -site:www.example.com -site:mail.example.com
site:*.*.example.com                                  # Third-level subdomains

Phase 2: Cloud Infrastructure

site:s3.amazonaws.com "example"
site:blob.core.windows.net "example"
site:azurewebsites.net "example"
site:*.amazonaws.com "example.com"

Phase 3: Certificate Transparency (supplement)

Query crt.sh for every TLS certificate issued under the domain: https://crt.sh/?q=%25.example.com (%25 is the URL-encoded % wildcard).
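
Certificate Transparency results are easier to merge with the Phase 1 subdomain list once they are deduplicated. The sketch below assumes the crt.sh response was saved in its JSON form (requested by adding `output=json` to the query) and that each entry carries a `name_value` field holding one or more newline-separated names. The function name `subdomains_from_crtsh` is hypothetical, and the sample data is made up.

```python
import json
from typing import Set


def subdomains_from_crtsh(json_text: str, domain: str) -> Set[str]:
    """Collect unique hostnames under `domain` from crt.sh JSON output."""
    names: Set[str] = set()
    for entry in json.loads(json_text):
        # name_value may hold several names separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard label
            if name == domain or name.endswith("." + domain):
                names.add(name)
    return names


# Made-up sample in the assumed response shape.
sample_json = json.dumps([
    {"name_value": "*.example.com\nwww.example.com"},
    {"name_value": "API.example.com"},
    {"name_value": "example.org"},
])
print(sorted(subdomains_from_crtsh(sample_json, "example.com")))
```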

Phase 4: Technology Stack

site:example.com intext:"Powered by" -intext:"Google"
site:example.com intext:"Built with" -intext:"love"
site:example.com ext:js inurl:"/static/" -inurl:".min.js"   # Unminified JS

Phase 5: Employee & Organizational OSINT

site:linkedin.com "example.com" "engineer" OR "developer" OR "DevOps"

Deliverable: Complete infrastructure dossier with:

All discovered subdomains and their likely functions
Technology stack per property
Cloud assets and their access levels
Exposed services requiring remediation
Risk-ranked finding list
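
The risk-ranked finding list can be produced mechanically once each finding carries a severity label. The sketch below is one possible arrangement: the five-level severity scale, the `Finding` record, and the sample entries are all assumptions for illustration, not a prescribed scoring model.

```python
from dataclasses import dataclass
from typing import List

# Assumed severity scale; lower number sorts first.
SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3, "Info": 4}


@dataclass
class Finding:
    asset: str
    issue: str
    severity: str


def rank_findings(findings: List[Finding]) -> List[Finding]:
    """Sort by severity, then by asset name for a stable report order."""
    return sorted(
        findings,
        # Unknown severity labels sort after every known level.
        key=lambda f: (SEVERITY_ORDER.get(f.severity, len(SEVERITY_ORDER)), f.asset),
    )


# Hypothetical findings for illustration.
report = rank_findings([
    Finding("dev.example.com", "Directory listing", "Low"),
    Finding("api.example.com", "Exposed .env file", "Critical"),
    Finding("ci.example.com", "Unauthenticated Jenkins", "High"),
])
for f in report:
    print(f"{f.severity:<8} {f.asset:<20} {f.issue}")
```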

Task 15: Advanced Persistent Dork — Automated Monitoring Rig

Objective: Build a dork-based monitoring system that alerts on new exposures. Difficulty: ★★★★★

This task moves beyond manual search into operational infrastructure. The goal: automate daily dorking and alert on new results.

Toolchain:

# Install requirements
pip install googlesearch-python requests

Python skeleton:

#!/usr/bin/env python3
"""
Dork Monitor — Passive OSINT Alert System
WARNING: Rate-limit your queries. Excessive querying violates Google ToS.
Use Google Custom Search API with proper API key for production.
"""
import hashlib
import json
import os
from datetime import datetime

# Dorks to re-run on every monitoring cycle
DORK_LIST = [
    'site:example.com filetype:env',
    'site:example.com ext:sql intext:"INSERT INTO"',
    'intitle:"index of" site:example.com',
    'site:*.example.com -site:www.example.com',
    'site:example.com intext:"Fatal error:"',
]

# JSON file holding the previous cycle's results (relative to the working directory)
RESULTS_FILE = "dork_baseline.json"

def hash_results(results: list) -> str:
    serialized = json.dumps(sorted(results), sort_keys=True)
    return hashlib.sha256(serialized.encode()).hexdigest()

def load_baseline() -> dict:
    if os.path.exists(RESULTS_FILE):
        with open(RESULTS_FILE) as f:
            return json.load(f)
    return {}

def save_baseline(baseline: dict):
    with open(RESULTS_FILE, "w") as f:
        json.dump(baseline, f, indent=2)

def alert(dork: str, new_results: list):
    timestamp = datetime.now().isoformat()
    print(f"\n[!] ALERT [{timestamp}]")
    print(f"    Dork: {dork}")
    print(f"    New results detected:")
    for r in new_results:
        print(f"      → {r}")

def run_dork(dork: str) -> list:
    """
    Replace this with Google Custom Search API call.
    DO NOT use scraping against Google in production.
    API endpoint: https://customsearch.googleapis.com/customsearch/v1
    """
    # Placeholder — implement with API key
    return []

def main():
    baseline = load_baseline()
    for dork in DORK_LIST:
        results = run_dork(dork)
        current_hash = hash_results(results)
        previous_hash = baseline.get(dork, {}).get("hash")
        if current_hash != previous_hash:
            previous_results = baseline.get(dork, {}).get("results", [])
            new_items = [r for r in results if r not in previous_results]
            if new_items:
                alert(dork, new_items)
        baseline[dork] = {
            "hash": current_hash,
            "results": results,
            "last_checked": datetime.now().isoformat()
        }
    save_baseline(baseline)
    print("[*] Dork monitor cycle complete.")

if __name__ == "__main__":
    main()

Production enhancements:

Integrate with Google Custom Search API (cx parameter + API key)
Push alerts to Slack webhook or email via SMTP
Store results in SQLite for trend analysis
Schedule with cron (Linux) or Task Scheduler (Windows)
Add Shodan API integration for exposed service monitoring
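
The first enhancement, replacing the `run_dork` placeholder with a Custom Search API call, might look like the sketch below. It is hedged on several points: the `key`, `cx`, and `q` query parameters and the `items`/`link` response fields reflect the Custom Search JSON API as commonly documented, the helper names `build_request_url` and `parse_links` are hypothetical, and this `run_dork` takes two extra arguments that the skeleton's version does not, so the caller would need adjusting.

```python
import json
import urllib.request
from typing import List
from urllib.parse import urlencode

# Endpoint quoted in the skeleton's run_dork() docstring.
API_ENDPOINT = "https://customsearch.googleapis.com/customsearch/v1"


def build_request_url(dork: str, api_key: str, cx: str) -> str:
    """Compose the Custom Search request URL for one dork."""
    return API_ENDPOINT + "?" + urlencode({"key": api_key, "cx": cx, "q": dork})


def parse_links(response_body: str) -> List[str]:
    """Extract result URLs; 'items' is treated as absent when there are no hits."""
    payload = json.loads(response_body)
    return [item["link"] for item in payload.get("items", [])]


def run_dork(dork: str, api_key: str, cx: str) -> List[str]:
    """Candidate replacement for the placeholder; performs one API call."""
    url = build_request_url(dork, api_key, cx)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return parse_links(resp.read().decode("utf-8"))
```

Keeping URL construction and response parsing separate from the network call means both can be checked offline, and the API key can be read from an environment variable rather than stored in the script.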

7. Operational Security & Legal Boundaries

The Law

Dorking itself — querying a public search engine — is generally legal. However:

| Action | Legal Status |
|--------|--------------|
| Querying Google for public info | ✅ Legal in most jurisdictions |
| Accessing an exposed file via direct URL | ⚠️ Legal grey area; governed by CFAA (US), CMA (UK), etc. |
| Downloading exposed data you are not authorized to access | ❌ Illegal under CFAA, GDPR, and equivalents |
| Testing systems without written authorization | ❌ Illegal |

Key principle: Finding a URL via Google does not grant you permission to access, download, or interact with the resource. Viewing an accidentally public file may constitute unauthorized access under strict interpretations of the CFAA (18 U.S.C. § 1030).

OPSEC for Researchers

When performing authorized assessments:

Route through authorized infrastructure: Use your engagement's designated exit node or VPN. Don't dork from personal IPs.
Log everything: Your engagement authorization, timestamps, query strings, and findings. Defense against legal blowback.
Use Google Custom Search API: Reduces fingerprinting versus browser-based searches.
Screenshot, don't download: For evidence documentation of sensitive files, screenshot the browser. Do not download files you lack authorization to access.
Rotate user agents: Automated dorking tools should rotate User-Agent strings to avoid Google CAPTCHA blocks.
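
The "log everything" item is easy to automate. Below is a minimal sketch of an append-only query log in JSON Lines form; the function name `log_query`, the record fields, and the `ENG-001` engagement identifier are assumptions chosen for illustration.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_query(log_path: Path, engagement_id: str, query: str, result_count: int) -> dict:
    """Append one JSON record per executed query and return the record."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # UTC avoids timezone ambiguity
        "engagement_id": engagement_id,
        "query": query,
        "result_count": result_count,
    }
    # Append mode keeps earlier records untouched.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record
```

A call such as `log_query(Path("queries.jsonl"), "ENG-001", "site:example.com ext:log", 3)` adds one line to the file, giving a timestamped trail that can be matched against the engagement authorization.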

Responsible Disclosure

If you discover exposed sensitive files on a system you do not have authorization to test:

Do not download or store the data
Identify the correct contact: security@domain.com, HackerOne/Bugcrowd program, or via WHOIS
Report with minimal detail initially: "I found an exposed configuration file at [URL]. It contains what appears to be database credentials."
Give reasonable remediation time: 90 days is standard per coordinated vulnerability disclosure norms
Document the discovery and your actions in case of legal questions

8. Further Resources

Essential Tools

| Tool | Purpose | URL |
|------|---------|-----|
| Google Hacking Database | Pre-built dork library | exploit-db.com/google-hacking-database |
| Shodan | IoT/service search engine | shodan.io |
| Censys | Certificate/host search | censys.io |
| FOFA | Chinese-market IoT search | fofa.info |
| ZoomEye | Cyberspace mapping | zoomeye.org |
| crt.sh | Certificate transparency | crt.sh |
| VirusTotal | URL/domain intelligence | virustotal.com |
| theHarvester | OSINT automation | github.com/laramies/theHarvester |
| SpiderFoot | OSINT platform | spiderfoot.net |
| Maltego | Visual link analysis | maltego.com |

Recommended Reading

"Google Hacking for Penetration Testers" — Johnny Long, Bill Gardner, Justin Brown
"The Web Application Hacker's Handbook" — Stuttard & Pinto
"Open Source Intelligence Techniques" — Michael Bazzell (7th edition)
PTES Technical Guidelines — pentest-standard.org
OWASP Testing Guide — owasp.org/www-project-web-security-testing-guide

Practice Platforms (Authorized Only)

HackTheBox.eu — Retired machine write-ups for OSINT practice
TryHackMe.com — Structured OSINT rooms
VulnHub.com — Downloadable vulnerable VMs
HackerOne/Bugcrowd — Real-world targets with legal authorization
OPSEC Labs — Build your own: deploy a misconfigured Docker stack locally

Closing Transmission

The hidden web is not hidden by encryption or access controls. It is hidden by assumption — the assumption that because something is hard to find, no one will find it. Dorking destroys that assumption with a search bar.

Every .env file sitting exposed in a web root is a developer who assumed their mistake would go unnoticed. Every database dump in a /backup directory is a DBA who assumed Google doesn't index file extensions. Every default-credential admin panel is a sysadmin who assumed obscurity was security.

Your job — as a red teamer, penetration tester, or security researcher — is to think like an adversary before an adversary does. Google is their first tool. It should be yours too.

Master the grammar. Query with precision. Report what you find.

The Dorking Manifesto is a living document. The Google Hacking Database grows daily. The exposed attack surface grows with it.

Until the next powerful blog: keep growing, keep practising, and bye-bye!

Tags: #OSINT #RedTeam #Dorking #GoogleHacking #PenTest #InfoSec #Recon #BugBounty #CyberSecurity #PassiveRecon

© 2026— Red Team Operations Research | Credits: @rot-ig