April 7, 2020

Kali Linux :: Email Harvesting

(Last Updated On: 31st January 2017)

In this post, I will show you a tool in Kali Linux that’s able to harvest email addresses that are publicly available on the internet.

Why do I want to harvest emails?

Whether it is a company you have hired to run a phishing campaign against you (for testing purposes) or a hacker running a malicious one, the first step is to discover which genuine email addresses are publicly available on the internet. This lets them compile a list of addresses to run the campaign against and, in some circumstances, use the information for more targeted campaigns, known as spear-phishing.

Running this for yourself will allow you to see what is out there. The tool I will use in this example checks search engines, and potentially social networks, to get these results.


The tool I will be using is called “theHarvester” and comes pre-installed on Kali Linux. It can harvest more than just emails: it can also be used to collect sub-domains, employee names, open ports and banners.

Using theHarvester

To get started, we will open up a terminal window and type in “theharvester”. This will open up the application and show us what options we have.

[email protected]:~# theharvester
 * *
 * | |_| |__ ___ /\ /\__ _ _ ____ _____ ___| |_ ___ _ __ *
 * | __| '_ \ / _ \ / /_/ / _` | '__\ \ / / _ \/ __| __/ _ \ '__| *
 * | |_| | | | __/ / __ / (_| | | \ V / __/\__ \ || __/ | *
 * \__|_| |_|\___| \/ /_/ \__,_|_| \_/ \___||___/\__\___|_| *
 * *
 * TheHarvester Ver. 2.7 *
 * Coded by Christian Martorella *
 * Edge-Security Research *
 * [email protected] *
 Usage: theharvester options

-d: Domain to search or company name
 -b: data source: google, googleCSE, bing, bingapi, pgp, linkedin,
 google-profiles, jigsaw, twitter, googleplus, all

-s: Start in result number X (default: 0)
 -v: Verify host name via dns resolution and search for virtual hosts
 -f: Save the results into an HTML and XML file (both)
 -n: Perform a DNS reverse query on all ranges discovered
 -c: Perform a DNS brute force for the domain name
 -t: Perform a DNS TLD expansion discovery
 -e: Use this DNS server
 -l: Limit the number of results to work with(bing goes from 50 to 50 results,
 google 100 to 100, and pgp doesn't use this option)
 -h: use SHODAN database to query discovered hosts

 theharvester -d microsoft.com -l 500 -b google -h myresults.html
 theharvester -d microsoft.com -b pgp
 theharvester -d microsoft -l 200 -b linkedin
 theharvester -d apple.com -b googleCSE -l 500 -s 300

Now all we need to do is construct our command. We start with “theharvester” so Kali knows what application we’re calling, and then we use the switches as listed above. When I ran the search against synack.co.uk, it returned no results for our email addresses because we’re so poorly indexed by Google, so I will use bbc.co.uk as an example and see what we get.

I’m using the following command:

[email protected]:~# theharvester -d bbc.co.uk -l 500 -b google

To break this down a little:

  1. -d bbc.co.uk: This is the domain we want to return the results for.
  2. -l 500: This is the number of search results we will limit ourselves to, the results being returned by the selected search engine. If you don’t set this, the default is 100.
  3. -b google: This is the search engine of choice, in this case Google; we do, however, have the option to use a number of other services (as seen in theharvester’s first output).
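If you want to run the same search against several domains or data sources, the breakdown above can be wrapped in a small helper. This is only a minimal sketch: the function name build_harvester_command is my own invention, and it only uses the -d, -l and -b switches shown in the help output earlier.

```python
import shlex


def build_harvester_command(domain, source="google", limit=100):
    """Return the argument list for a theharvester run.

    Hypothetical helper: it only assembles the -d, -l and -b switches
    listed in the tool's help output and does not run anything.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return ["theharvester", "-d", domain, "-l", str(limit), "-b", source]


if __name__ == "__main__":
    argv = build_harvester_command("bbc.co.uk", source="google", limit=500)
    # Print the command as a single shell-safe line to paste into a terminal.
    print(shlex.join(argv))
```

Keeping the command as a list rather than a single string means it can be handed straight to something like subprocess.run without worrying about shell quoting.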

And here are our results!

[-] Searching in Google:
 Searching 0 results...
 Searching 100 results...
 Searching 200 results...
 Searching 300 results...
 Searching 400 results...
 Searching 500 results...
 [+] Emails found:
 [email protected]
 [email protected]

[+] Hosts found in search engines:
 [-] Resolving hostnames IPs...

Oh… not as many results as we were hoping for. When I ran this same command but replaced “-b google” with “-b all”, a lot more results were returned. I’m just not so sure the BBC would appreciate me posting a large number of their emails to this site, even if they are publicly available. The preferred outcome for you would be to find as few emails as possible, and this output merely serves as an example of what to expect from this application.
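If you run the search with several data sources and save the terminal output, you will probably want a de-duplicated list of addresses to review. The following is a rough sketch that assumes the output has the same shape as the results shown above (a “[+] Emails found:” heading followed by one address per line). The function name extract_emails and the example.com addresses are made up for illustration.

```python
import re

# Deliberately simple pattern; good enough for a quick review, not RFC-complete.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(output, domain):
    """Return the sorted, unique addresses at `domain` found in saved output.

    Only lines under the "[+] Emails found" heading are read, and addresses
    are lower-cased so that duplicates differing only in case collapse.
    """
    suffix = "@" + domain.lower()
    in_section = False
    found = set()
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("[+]"):
            # A new "[+]" heading either opens or closes the email section.
            in_section = stripped.startswith("[+] Emails found")
            continue
        if in_section:
            for address in EMAIL_RE.findall(stripped):
                if address.lower().endswith(suffix):
                    found.add(address.lower())
    return sorted(found)


if __name__ == "__main__":
    sample = """[-] Searching in Google:
 Searching 0 results...
 [+] Emails found:
 alice@example.com
 Bob@example.com
 alice@example.com

[+] Hosts found in search engines:
 www.example.com
"""
    for address in extract_emails(sample, "example.com"):
        print(address)
```

Note that this only keeps addresses directly at the given domain; addresses on sub-domains would need a looser suffix check.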

I hope you have found this post useful, and as ever – please take a look at our other posts.





Jake is a security engineer working in West Yorkshire. He has experience with various firewall vendors including FortiGate, Check Point, Cisco and Palo Alto.
