dorking (how to find anything on the Internet)

tl;dr: Use advanced Google Search to find any webpage, emails, info, or secrets

cost: $0

time: 2 minutes


Software engineers have long joked about how much of their job is simply Googling things.

Now you can do the same, but for free.

Below, I'll cover dorking, the use of search engines to find very specific data:

  1. webpages
  2. emails
  3. files
  4. SEO
  5. coupons!
  6. secrets
  7. operator review


#1 - webpages

Inspired by this Twitter exchange with Gumroad CEO Sahil Lavingia, the next few examples will cover Gumroad and Sahil. You can paste each example directly into Google to see the result.

find specific pages within a website (ex: for DynamoDB e-books)

site:gumroad.com dynamodb

find specific pages that must include a phrase in the Title text

allintitle:"support this" site:gumroad.com

find similar sites (Google only)

related:gumroad.com

you can chain operators together (ex: looking for bug bounty pages that have security or bug-bounty in the URL, or that live on HackerOne)

(inurl:security OR inurl:bug-bounty OR site:hackerone.com) + "gumroad"

you can restrict to certain top-level domains (ex: lists of teachers)

site:.edu filetype:xls inurl:"email.xls"
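
Since every example here is just a query string, you can also turn one into a link to bookmark or share. Here is a minimal sketch, assuming Google's usual https://www.google.com/search?q= URL form. The helper name buildSearchUrl is made up for illustration.

```javascript
// Hypothetical helper: turn a dork query into a Google search URL.
// Assumes the standard https://www.google.com/search?q=... URL form.
function buildSearchUrl(query) {
    // encodeURIComponent percent-encodes spaces, colons, and double quotes
    // so the whole query survives as a single URL parameter.
    return "https://www.google.com/search?q=" + encodeURIComponent(query);
}

console.log(buildSearchUrl("site:gumroad.com dynamodb"));
// → https://www.google.com/search?q=site%3Agumroad.com%20dynamodb
```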


#2 - emails

find Gmail accounts

sahil lavingia "@gmail.com"

find work accounts (you'll need to find their domain first)

sahil lavingia "@gumroad.com"

not finding what you're looking for with either of those? Try to guess the format of the email (go to this site, search the domain, and click Identified Name Formats)

"s.lavingia" "@" ".com"
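
If you don't know the format, you can generate a handful of guesses and try each one. The sketch below builds queries in the same shape as the one above. The list of name formats is my own illustrative assumption, not an exhaustive or authoritative set, and emailFormatQueries is a made-up name.

```javascript
// Hypothetical helper: build one search query per guessed email name format.
// The formats listed here are illustrative assumptions, not a complete list.
function emailFormatQueries(first, last) {
    const f = first.toLowerCase();
    const l = last.toLowerCase();
    const localParts = [
        f + "." + l,    // first.last
        f[0] + "." + l, // f.last
        f[0] + l,       // flast
        f + l,          // firstlast
        f,              // first
    ];
    // Same shape as the query above: "local part" "@" ".com"
    return localParts.map((local) => '"' + local + '" "@" ".com"');
}

emailFormatQueries("Sahil", "Lavingia").forEach((q) => console.log(q));
```

Paste each printed line into Google one at a time; the second one is the query shown above.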

you can also find every page on a site that has emails on it (and then run the snippet below on each page)

site:gumroad.com intext:"@gumroad.com"

find every email on a web page that you're on. The big kahuna - this works for every website. Inject it into a site with Chrome DevTools (more here)

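// Match email-like strings anywhere in the page's HTML (visible text and mailto: links).
// No ^ or $ anchors, so an email is found even when it sits inside a longer string.
var re = /[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+/g;
var matches = document.body.innerHTML.match(re) || [];
// Drop duplicates, then log one email per line.
Array.from(new Set(matches)).forEach(function (email) {
    console.log(email);
});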

this will log every email found, without you having to scan through the whole page.


#3 - files

find spreadsheets

filetype:csv OR filetype:xlsx OR filetype:xls OR filetype:xltx OR filetype:xlt OR inurl:airtable.com/universe/
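
Long OR chains like this one are tedious to type, so here is a small sketch that builds them from a list of extensions. The function name anyFiletype is hypothetical, and wrapping the chain in parentheses next to a site: filter is just the operator chaining from section #1.

```javascript
// Hypothetical helper: join several extensions into one filetype OR chain.
function anyFiletype(extensions) {
    return extensions.map((ext) => "filetype:" + ext).join(" OR ");
}

const spreadsheets = anyFiletype(["csv", "xlsx", "xls", "xltx", "xlt"]);

// Combine the chain with a site: filter to search a single domain.
console.log("site:gumroad.com (" + spreadsheets + ")");
```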

find Google Docs and Google Sheets

site:docs.google.com "gumroad"

find where your competitor's logo is (ex: partners or customers' websites)

"Gumroad Logo.png"

find your competitors' sales pitches and whitepapers

site:intercom.com (filetype:pdf OR filetype:ppt)


#4 - SEO

find sites with specific keywords in the anchor text

inanchor:"cyber security"

research blog posts with specific keywords in their title

inposttitle:"diy slime"

find backlinks (ex: other sites that link to a particular blog post)

link:https://blog.gumroad.com/post/189293637718/gumroad-now-auto-enables-https-on-custom-domains

find keyword permutations (AROUND(n) looks for pages where the two phrases appear within n words of each other)

graphic design AROUND(2) tools

find companies using a given widget

intext:"Powered by intercom" -site:intercom.com


#5 - coupons

search the site itself for codes

site:curology.com ("coupon" | "referral code" | "affiliate code" | "discount code" | "VIP")

next, try Twitter

site:twitter.com + "meundies" + ("coupon" | "referral code" | "affiliate code" | "discount code" | "VIP")

next, try Mailchimp emails

site:campaign-archive.com + "blueapron" + ("coupon" | "referral code" | "affiliate code" | "discount code" | "VIP")
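
The three coupon searches above share the same list of terms, so you can generate all of them for any brand. A rough sketch, assuming you know the brand's name and domain; couponQueries and COUPON_TERMS are names I made up for the example.

```javascript
// The coupon-related terms used in the three queries above.
const COUPON_TERMS = ["coupon", "referral code", "affiliate code", "discount code", "VIP"];

// Hypothetical helper: build the site, Twitter, and Mailchimp coupon queries for a brand.
function couponQueries(brand, brandDomain) {
    // ("coupon" | "referral code" | ...)
    const terms = "(" + COUPON_TERMS.map((t) => '"' + t + '"').join(" | ") + ")";
    return [
        "site:" + brandDomain + " " + terms,
        'site:twitter.com + "' + brand + '" + ' + terms,
        'site:campaign-archive.com + "' + brand + '" + ' + terms,
    ];
}

couponQueries("meundies", "meundies.com").forEach((q) => console.log(q));
```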


#6 - secrets

cybersecurity experts use dorking, as one tool among many, to find potential vulnerabilities in a company. I will not be covering any such queries, out of concern for their potential for misuse.


#7 - operator review

operators are components of a search query that narrow the results down. You can combine as many as you want in one query. The most useful ones you'll want to know are:

operator               description
"phrase"               results must include "phrase"
~phrase                search for phrase and synonyms
-phrase                exclude results with phrase
phrase1 + phrase2      similar to AND
phrase1 | phrase2      similar to OR
site:example.com       results must be on domain example.com
filetype:jpg           results must be of type .jpg

AND/OR logic can be controlled with parentheses:

("phrase1" OR "phrase2") AND "phrase3"
# equivalent to these two searches
>> "phrase1" AND "phrase3"
>> "phrase2" AND "phrase3"
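
To make the equivalence concrete, here is a tiny sketch that expands an OR group into the separate searches it stands for. It only illustrates the logic; expandOrGroup is a made-up name, and nothing here talks to Google.

```javascript
// Hypothetical helper: distribute an OR group over one required phrase.
// ("a" OR "b") AND "c"  becomes  "a" AND "c"  and  "b" AND "c"
function expandOrGroup(alternatives, required) {
    return alternatives.map((alt) => '"' + alt + '" AND "' + required + '"');
}

expandOrGroup(["phrase1", "phrase2"], "phrase3").forEach((q) => console.log(q));
```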


Thanks for reading. Questions or comments? 👉🏻 alec@contextify.io