theHarvester open-source OSINT tool setup on Debian and Ubuntu

theHarvester is one of the best open-source OSINT tools. It is a simple yet effective tool designed for the early stages of a penetration test. Use it to gather open-source intelligence and to help determine a company’s external threat landscape on the internet. The tool gathers emails, names, subdomains, IPs, and URLs from multiple public data sources, including:

  • baidu: Baidu search engine

  • bing: Microsoft search engine

  • bingapi: Microsoft search engine, through the API

  • CertSpotter: Cert Spotter monitors Certificate Transparency logs

  • crtsh: Comodo Certificate search

  • dnsdumpster

  • dogpile

  • duckduckgo

More sources are available; in total, theHarvester pulls information from over 20 public sources.

Here are some notable features of theHarvester:

  • Virtual host finder
  • Subdomain finder
  • Searching emails of given domains
  • Basic port scanner
  • Social profile finder for individuals
  • Export results as graphs to HTML

Setting up theHarvester on Debian and Ubuntu

This tool requires Python 3.7 or later; earlier Python versions won’t work.

sudo apt-get install python3-pip
sudo pip3 install virtualenv
virtualenv -p python3 myenv
source myenv/bin/activate
git clone
pip3 install -r requirements.txt

Or, if you’d rather skip those steps, you can install it directly with pip:

pip3 install theHarvester
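
Because the tool needs Python 3.7 or later, a script that depends on it could check the interpreter version up front. The following is only a minimal sketch: the helper name `python_is_supported` is hypothetical, and the 3.7 floor is simply the requirement stated in this section.

```python
import sys

# Minimum interpreter version, taken from the requirement stated in this article.
MIN_VERSION = (3, 7)


def python_is_supported(version_info=None, minimum=MIN_VERSION):
    """Return True when the (major, minor) version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_is_supported():
        sys.exit("Python %d.%d or newer is required" % MIN_VERSION)
    print("Python version OK")
```

Running this check before importing theHarvester gives a clear message instead of a confusing import or syntax error on an older interpreter.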

Using theHarvester as a Module

This tool is intended to be used from the command line by invoking

theHarvester -d -b google
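
The command-line tool can also be driven from a script. This is a rough sketch under a few assumptions: the `theHarvester` executable is on your PATH, `example.com` is a placeholder domain, and `build_command` and `run_harvester` are hypothetical helper names. Only the `-d` and `-b` options shown in the command above are used.

```python
import subprocess


def build_command(domain, source="google"):
    """Build the argument list for one theHarvester run."""
    if not domain:
        raise ValueError("a target domain is required")
    return ["theHarvester", "-d", domain, "-b", source]


def run_harvester(domain, source="google"):
    """Run theHarvester (assumed to be on PATH) and return its stdout."""
    result = subprocess.run(
        build_command(domain, source),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


if __name__ == "__main__":
    # Only print the command here; call run_harvester() to actually execute it.
    print(" ".join(build_command("example.com")))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems if the domain comes from user input.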

But it can also be used as a module in your Python scripts; that’s what we did with our online integration. Just like this

import theHarvester
from theHarvester.discovery import baidusearch
from theHarvester.discovery import bingsearch
from theHarvester.discovery import dnsdumpster
from theHarvester.discovery import googlesearch
#....and more....

# or

# from theHarvester.discovery import *

baidu = baidusearch.SearchBaidu("", 100)

# Each discovery engine has its own methods

# not all have get_emails
emails = baidu.get_emails()
hostnames = baidu.get_hostnames()

# That's how you can use theHarvester in any other Python 3 module.
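
Since not every discovery engine offers the same getters, calling code may want to probe for them rather than assume they exist. The sketch below shows one way to do that. It assumes the getters are plain synchronous methods, as in the snippet above; `collect` and `FakeEngine` are hypothetical names, and `FakeEngine` is a stand-in so the example runs without theHarvester installed.

```python
def collect(engine, method_names=("get_emails", "get_hostnames")):
    """Call each getter the engine provides and skip the ones it lacks."""
    results = {}
    for name in method_names:
        getter = getattr(engine, name, None)
        if callable(getter):
            # De-duplicate and sort so the output is stable between runs.
            results[name] = sorted(set(getter()))
    return results


class FakeEngine:
    """Stand-in for a discovery engine that only exposes hostnames."""

    def get_hostnames(self):
        return ["www.example.com", "mail.example.com", "www.example.com"]


if __name__ == "__main__":
    print(collect(FakeEngine()))
```

With a real engine, you would pass the object created from a discovery module (for example the `baidu` object above) in place of `FakeEngine()`.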