No matter how robust a company's firewalls are, human error remains the ultimate vulnerability. Developers are human; they accidentally commit passwords to public repositories, misconfigure cloud storage permissions, and leave backup files on production servers.
For a bug bounty hunter, these mistakes are goldmines. Finding a hardcoded API key or an exposed configuration file can lead to critical system compromises without requiring complex exploit development. In this post, we will move beyond basic scanning and learn the advanced techniques professionals use to hunt for leaked secrets.
1. Weaponizing GitHub & Entropy Scanning
Many modern data breaches originate from public code repositories. While beginners simply type "password" into the GitHub search bar, elite hunters use automation and advanced pattern recognition.
- Entropy Scanning: Hardcoded secrets (like AWS keys or cryptographic tokens) do not always have obvious variable names. However, they do have high entropy, meaning they look like long, random strings rather than ordinary words. Tools like TruffleHog and Gitleaks scan repositories using both regular expressions (regex) and Shannon entropy to find these hidden keys.
- Gitrob: This tool uses the GitHub API to clone a target's repositories and iterates through the entire commit history, flagging files that match signatures for sensitive information.
- Real-World Case (Algolia RCE): Hacker Michiel Prins used Gitrob on Algolia's open-source `facebook-search` repository. The tool flagged a file named `secret_token.rb` from a past commit. This file contained the Ruby on Rails `secret_key_base`, a critical value used to sign cookies. By obtaining this secret, the hacker forged a malicious cookie and achieved Remote Code Execution (RCE), earning a $500 bounty.
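To make the entropy idea concrete, here is a minimal Python sketch of a Shannon entropy check. It is not how TruffleHog or Gitleaks are implemented internally; the function names, the 4.0 threshold, and the 20-character minimum are assumptions chosen for illustration, and the sample token is the example secret key that AWS publishes in its own documentation.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of text, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def flag_high_entropy(tokens, threshold=4.0, min_length=20):
    """Return tokens that are long enough and exceed the entropy threshold.

    threshold and min_length are illustrative defaults, not values
    taken from any particular scanner.
    """
    return [
        t for t in tokens
        if len(t) >= min_length and shannon_entropy(t) > threshold
    ]


# Hypothetical tokens pulled from a source file
tokens = [
    "password_password_password",                # long but repetitive
    "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",  # AWS documentation example key
]
for hit in flag_high_entropy(tokens):
    print("possible secret:", hit)
```

Entropy alone produces false positives (hashes, minified code, UUIDs), which is one reason to pair it with regex signatures as described above.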
2. Reconstructing Exposed .git Directories
When developers use Git, a hidden .git folder is created. If a developer carelessly uploads the entire project directory to a production web server, the .git folder becomes publicly accessible, allowing attackers to download the application's entire source code.
- Basic Extraction: You can verify the exposure by visiting `https://target.com/.git/config` or `https://target.com/.git/HEAD`. If directory listing is enabled on the server, you can simply use `wget -r https://target.com/.git` to recursively download the repository.
- Advanced Reconstruction (No Directory Listing): If directory listing is disabled, a recursive download fails because the server no longer reveals which files exist. However, you can reconstruct the code manually. First, read the `.git/HEAD` file, which will point to the current branch reference (e.g., `ref: refs/heads/master`). Reading that reference file gives you the 40-character SHA-1 hash of a Git commit object.
- Git objects are stored in the `.git/objects/` directory, organized by the first two characters of the hash. By requesting these specific object files (e.g., `https://target.com/.git/objects/0a/72e6850...`), you can download them. Because these files are compressed, you must use a script with a library like `zlib` in Python or Ruby to decompress them locally. The commit object names a tree object, and the tree lists the hashes of the blobs that hold the original source code, so you repeat the same request for each hash you uncover.
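The manual steps above can be sketched in Python roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the HTTP download is left out (use whatever client you prefer), and it only handles loose objects, not objects stored in packfiles.

```python
import zlib


def parse_head(head_text: str) -> str:
    """Extract the ref path from the contents of .git/HEAD."""
    text = head_text.strip()
    prefix = "ref: "
    if text.startswith(prefix):
        return text[len(prefix):]
    # Detached HEAD: the file holds a commit hash directly
    return text


def object_url(base_url: str, sha1: str) -> str:
    """Build the URL of a loose object from its 40-character hash."""
    return f"{base_url.rstrip('/')}/.git/objects/{sha1[:2]}/{sha1[2:]}"


def decode_object(raw: bytes):
    """Decompress a loose object and split its header from its body.

    A loose object is zlib-compressed data of the form
    b"<type> <size>\\x00<body>".
    """
    data = zlib.decompress(raw)
    header, _, body = data.partition(b"\x00")
    obj_type, _, size = header.partition(b" ")
    return obj_type.decode(), int(size), body
```

For a commit object, the decoded body is plain text whose first line starts with `tree` followed by the hash of the next object to fetch. Tree bodies are binary, so they need a little extra parsing before you can read the blob hashes.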
3. Cloud Storage Pwnage (Amazon S3 Buckets)
Organizations rely heavily on Amazon Simple Storage Service (S3) to host assets. However, administrators frequently misconfigure the access control lists (ACLs), leaving buckets open to the public.
- Finding Buckets: You can identify bucket names using tools like Lazys3, which brute-forces permutations of the company's name (e.g., `target-backup`, `target-media`), or Bucket Stream, which derives candidate names from certificate transparency logs. You can also use search engines like GrayhatWarfare, which indexes publicly exposed S3 buckets.
- Exploitation via AWS CLI: Once you find a bucket (e.g., `hackerone.files`), you must test its permissions using the AWS Command Line Interface.
- To test read access: `aws s3 ls s3://BUCKET_NAME/`
- To test write access, attempt to move a benign text file to the bucket: `aws s3 mv test.txt s3://BUCKET_NAME/test.txt`
- Real-World Case (Shopify & HackerOne): Hackers discovered that both Shopify and HackerOne had S3 buckets that were publicly writable. By brute-forcing bucket names and using the `aws s3 mv` command, security researcher Peter Yaworski proved he could write and delete arbitrary files on HackerOne's infrastructure, netting a $2,500 bounty. Warning: Always use a benign test file and never delete company data during your tests.
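To show what "brute-forcing permutations" of a company name looks like, here is a small Python sketch that builds a candidate list. It does not reproduce any specific tool's wordlist or logic; the function name, the word list, and the separators are assumptions for illustration. The only hard rule applied is the S3 limit of 3 to 63 characters per bucket name.

```python
from itertools import product


def candidate_bucket_names(company, words, separators=("-", ".", "")):
    """Combine a company name with common words to guess bucket names.

    words and separators are illustrative; real wordlists are far larger.
    """
    company = company.lower()
    names = set()
    for word, sep in product(words, separators):
        word = word.lower()
        names.add(f"{company}{sep}{word}")
        names.add(f"{word}{sep}{company}")
    # S3 bucket names must be between 3 and 63 characters long
    return sorted(n for n in names if 3 <= len(n) <= 63)


# Hypothetical target and wordlist
for name in candidate_bucket_names("target", ["backup", "media"]):
    print(name)
```

Each candidate can then be checked with the `aws s3 ls` command shown above, staying within the scope and rate limits of the program you are testing.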
4. The Deep Web Archives & Paste Sites
Sometimes, the leak is not hosted on the company's infrastructure.
- Paste Sites: Developers occasionally use sites like Pastebin or GitHub Gists to share code snippets or server logs with coworkers. Because these are public, they frequently leak credentials. Automated tools like PasteHunter or pastebin-scraper can be configured to continuously monitor these sites for your target's domain.
- The Wayback Machine: Web applications evolve, and developers delete vulnerable files. However, the Wayback Machine (archive.org) keeps historical snapshots. By using a tool like Waybackurls, you can extract years of historical endpoints. Filtering these results for extensions like `.conf`, `.env`, `.bak`, or `.sql` often reveals legacy configuration files or database dumps that were indexed years ago and are still accessible.
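The filtering step can be sketched in a few lines of Python. This assumes you already have a list of historical URLs, one per entry (for example, the saved output of Waybackurls); the function name and the sample URLs are hypothetical.

```python
from urllib.parse import urlparse

# Extensions named in the text above; extend as needed
SENSITIVE_EXTENSIONS = (".conf", ".env", ".bak", ".sql")


def filter_sensitive_urls(urls, extensions=SENSITIVE_EXTENSIONS):
    """Keep URLs whose path (ignoring the query string) ends with a listed extension."""
    hits = []
    for url in urls:
        url = url.strip()
        path = urlparse(url).path.lower()
        if path.endswith(tuple(extensions)):
            hits.append(url)
    return hits


# Hypothetical archived URLs
archived = [
    "https://target.com/index.html",
    "https://target.com/backup/db.sql?download=1",
    "https://target.com/.env",
]
for hit in filter_sensitive_urls(archived):
    print(hit)
```

Matching on the parsed path rather than the raw string keeps query parameters from hiding an interesting extension. A hit only means the file was archived at some point, so each one still needs to be checked against the live site or the stored snapshot.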
By mastering these techniques, you transform from a casual scanner into a digital scavenger, capable of finding the invisible keys to the kingdom.