OSINT: How to find information on anyone

Your data is more exposed than you think

Petro Cherkasets
May 7 · 20 min read

Open Source Intelligence (OSINT) is the gathering of information from publicly available sources and its analysis to produce actionable intelligence. Its scope is not limited to cybersecurity; it also covers corporate, business and military intelligence, and any other field where information matters.

Whether you are a recruiter, a marketing manager, a cybersecurity engineer or just curious, you will find something useful in this article. Maybe you want to know what data of yours is out there for others to find, or you just want to check whether the person or organization that contacted you online is legit. In this article, I will explain how to discover a person’s digital footprint, perform digital investigations and gather information for competitive intelligence or penetration testing.

Many OSINT tools are available nowadays, so I’m not going to cover them all, only the most popular ones and those useful in the described use cases. In this guide, I show a general approach along with different tools and methods that you can use depending on the requirements and the initial data you have.

Basic OSINT steps

  1. Define requirements (what you want to get)
  2. Gather the data
  3. Analyze collected data
  4. Pivot as needed using newly gathered data
  5. Validate assumptions
  6. Generate report

Real name

IntelTechniques.com OSINT Workflow Chart: Real Name

Governmental resources

Google Dorks

  • “john doe” site:instagram.com — quotation marks force Google to match the phrase exactly, and site: restricts the search to Instagram.
  • “john doe” -“site:instagram.com/johndoe” site:instagram.com — hides posts from the target’s own account but shows comments the target left on other people’s Instagram posts.
  • “john” “doe” -site:instagram.com — shows results that exactly match the given name and surname in any combination and excludes Instagram from the results.
  • “CV” OR “Curriculum Vitae” filetype:PDF “john” “doe” — searches for the target’s resumes: documents that mention “CV” or “Curriculum Vitae” and have a PDF extension.

Wrap single words in quotes if you are 100% sure about the spelling, because by default Google will try to reshape your keyword into what the masses search for. By the way, what’s interesting about Instagram is that with the right Google Dork you can see comments and likes of private accounts.

Perform a search using advanced search queries on Bing, Yandex and DuckDuckGo as other search engines might give you results that Google couldn’t.
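The query patterns from the Google Dorks list can also be assembled by a small script, which helps when the same search has to be repeated across several engines. Below is a minimal Python sketch, not a definitive tool: the helper names (`exact`, `build_dork`, `search_urls`) are hypothetical, and the search URL prefixes are assumptions about how each engine currently accepts a query string.

```python
from urllib.parse import quote_plus

# Search URL prefixes; assumed here for illustration and may change over time.
ENGINES = {
    "google": "https://www.google.com/search?q=",
    "bing": "https://www.bing.com/search?q=",
    "duckduckgo": "https://duckduckgo.com/?q=",
    "yandex": "https://yandex.com/search/?text=",
}


def exact(phrase):
    """Wrap a phrase in double quotes to request an exact match."""
    return f'"{phrase}"'


def build_dork(terms, site=None, exclude_site=None, filetype=None):
    """Join quoted terms with optional -site:, site: and filetype: operators."""
    parts = [exact(term) for term in terms]
    if exclude_site:
        parts.append(f"-site:{exclude_site}")
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


def search_urls(query):
    """Return one URL-encoded search link per engine for the same query."""
    return {name: prefix + quote_plus(query) for name, prefix in ENGINES.items()}


if __name__ == "__main__":
    query = build_dork(["john doe"], site="instagram.com")
    print(query)
    for engine, url in search_urls(query).items():
        print(engine, url)
```

Not every engine supports every operator in the same way, so treat the generated links as a starting point and adjust the query per engine.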

People search

People search websites allow people to opt out, but after they remove themselves from the listings, new search services appear with their records in them. The reason is that the same dataset is bought and used by different services. Some companies own those datasets, and even if a person removes the listing on one of their websites, the old data is repopulated on a new domain, so the previously removed profile reappears in search. Consequently, if people did a pretty good job of cleaning their stuff up, you just have to wait for a new database to appear. One method of finding people who opted out is to go to the people search service, find a unique paragraph, run a quoted Google search on it and find all of the domains that the company owns. There is a chance that the information your target removed from site A is now on site B.


User name

IntelTechniques.com OSINT Workflow Chart: User Name

Firstly, we have to find a username. Usually, it is a name plus surname combination, or it is derived from the email address or from the domain name of a website the person uses or owns. Start with the data you have and do a reverse lookup towards what you need. Obviously, the simplest way is to Google any relevant data known to you at the moment and try to find pages with the username. You can also use special websites that do a reverse username search, like socialcatfish.com, usersearch.org or peekyou.com.

Google Dorks

  • inurl:johndoe site:instagram.com — search for URLs on Instagram that contain “johndoe” in them.
  • allinurl:john doe ny site:instagram.com — find pages with the words “john”, “doe” and “ny” in the Instagram URL. Similar to inurl but supports multiple words.

Depending on the complexity of your search and how successful the previous methods were, you might want to generate a wordlist. It’s useful when you need to try a lot of options because you don’t have a clear picture of what the username should be but have a lot of guesses. I have used the Python script below for generating the wordlist:

Name and surname were specified in Names.txt, in Terminal we just see the output
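The script itself is not reproduced here, so the following is only a minimal stand-in sketch of the same idea, not the script mentioned above. It assumes that Names.txt holds one “Name Surname” pair per line; the function names and the set of separators are hypothetical choices.

```python
from itertools import product


def username_candidates(first, last, separators=("", ".", "_", "-")):
    """Combine full names and initials in both orders with common separators."""
    first, last = first.lower(), last.lower()
    firsts = [first, first[0]]
    lasts = [last, last[0]]
    candidates = []
    for f, l in product(firsts, lasts):
        if len(f) == 1 and len(l) == 1:
            continue  # skip bare initials such as "jd": too short to be useful
        for sep in separators:
            for candidate in (f + sep + l, l + sep + f):
                if candidate not in candidates:
                    candidates.append(candidate)
    return candidates


def wordlist_from_lines(lines):
    """Build a de-duplicated wordlist from lines shaped like 'Name Surname'."""
    words = []
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # ignore blank or single-word lines
        for candidate in username_candidates(parts[0], parts[-1]):
            if candidate not in words:
                words.append(candidate)
    return words


if __name__ == "__main__":
    # A file object works too: wordlist_from_lines(open("Names.txt"))
    for word in wordlist_from_lines(["John Doe"]):
        print(word)
```

Numbers, birth years or nicknames can be added as extra building blocks in the same way once you have more guesses about the target.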

Username search

Searching for “johndoe” username on 152 sites with WhatsMyName

Be prepared for false positives while searching, as someone else might be using the same username.
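Username checkers generally work through a list of profile URL patterns and test each one for the given name. The sketch below only builds the candidate URLs and sends no requests. It is not WhatsMyName’s code or data format; the three URL patterns and the function name are assumptions chosen for illustration.

```python
from urllib.parse import quote

# Profile URL patterns assumed for illustration; real tools keep much longer lists.
PROFILE_PATTERNS = {
    "GitHub": "https://github.com/{username}",
    "Instagram": "https://www.instagram.com/{username}/",
    "Reddit": "https://www.reddit.com/user/{username}",
}


def candidate_profiles(username, patterns=PROFILE_PATTERNS):
    """Return a site -> URL mapping of pages to review for a given username."""
    safe_name = quote(username, safe="")  # escape characters not valid in a URL path
    return {site: pattern.format(username=safe_name) for site, pattern in patterns.items()}


if __name__ == "__main__":
    for site, url in candidate_profiles("johndoe").items():
        print(f"{site}: {url}")
```

An existing page only shows that the username is taken on that site, so each hit still has to be checked by hand against what you already know about the target.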

Note: Running WhatsMyName, as well as any locally installed tool, could be an issue when certain websites are blocked by your ISP. In that case, going through a proxy or VPN will solve the issue. Moreover, to avoid exposure you should use anonymizers anyway.


Email Address

IntelTechniques.com OSINT Workflow Chart: Email Address

Google Dorks

  • HR “email” site:example.com filetype:csv | filetype:xls | filetype:xlsx — find HR contact lists on a given domain.
  • site:example.com intext:@gmail.com filetype:xls — extract email IDs from Google on a given domain.

Email tools

  • Email permutator — generates permutations for up to three domains at which the target is likely to have an email address. Supports multiple variable inputs to generate custom results.
  • Proofy — allows bulk email validation, which is useful when you have generated a list of emails with a permutation tool and want to check all of them at once.
  • Verifalia — validates single email addresses for free without registration. To use bulk validation you have to sign up.
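To make the idea behind a permutation tool concrete, here is a minimal Python sketch. It is not the code of any of the tools listed above; the function name and the chosen local-part patterns are assumptions, and the output is only a list of guesses that still needs validation.

```python
def email_permutations(first, last, domains):
    """Combine common local-part patterns with each candidate domain."""
    first, last = first.lower(), last.lower()
    local_parts = [
        first,
        last,
        first + last,
        f"{first}.{last}",
        f"{first[0]}{last}",
        f"{first}{last[0]}",
        f"{last}.{first}",
        f"{first}_{last}",
    ]
    unique_parts = list(dict.fromkeys(local_parts))  # drop duplicates, keep order
    return [f"{part}@{domain}" for domain in domains for part in unique_parts]


if __name__ == "__main__":
    for address in email_permutations("John", "Doe", ["example.com", "example.org"]):
        print(address)
```

The resulting list is the kind of input that a bulk validator expects.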

Browser plugins

  • OSINT browser extension — contains a lot of useful links, including ones for email search and verification. Compatible with Firefox and Chrome.
  • LinkedIn Sales Navigator — plugin for Chrome that shows the associated Twitter account and rich LinkedIn profile data directly in Gmail.

Compromised databases

Another option would be to use dehashed.com. With a free account it works similarly to Troy Hunt’s website, but with an active subscription it shows passwords in clear text or as password hashes. From an OSINT perspective, we need that to check whether the password was used on other websites, which is one more way to find out which services the person uses or at least used. Searching by a password or its hash shows not only on which websites it was used but also the email addresses tied to it. Thus, we can get the target’s emails that we wouldn’t obtain otherwise. It’s important to note that if the password is not unique we might get false positives, as other people might use it as well.


Phone number

IntelTechniques.com OSINT Workflow Chart: Telephone #

Sometimes people link their phone number and email to their Facebook profile, so typing it into the Facebook search might show you the profile. Another option is to look up user-supplied databases of phone numbers, like whocalledme.com. The database is not limited to America; numbers from Europe can be checked as well. For those who want something similar on a mobile device, there are several apps: privacystar.com, getcontact.com, and everycaller.com. There are many reverse phone lookup services, and they are usually country-specific, so find the one that fits your needs.

PhoneInfoga

Features:

  • Check if phone number exists and is possible
  • Gather standard information such as country, line type and carrier
  • Check several numbers at once
  • OSINT reconnaissance using external APIs, Google Hacking, phone books & search engines
  • Use custom formatting for more effective OSINT reconnaissance

Well, you can see how many resources were scanned. Definitely faster than manual search.

Android Emulator

Save the number in your phone and look at the Viber or WhatsApp contact list. These services allow users to add a photo, a biography and the owner’s name, and this information can be extracted just by knowing the telephone number.

  • Bluestacks — made primarily for gamers but runs other apps as well. Available for Windows, Mac and Linux and doesn’t require a Virtual Machine to set it up so it installs easier than Genymotion.
  • Genymotion — widely used by developers but also has a free version for personal use. Works on Windows, Mac and Linux and has a range of virtual devices to choose from. Use this guide from IntelTechniques to set up the emulator.
  • AMIDuOS — available only for Windows and leverages device drivers from the system to enable near-native performance in Android. It’s fast and has a straightforward installation. However, while the aforementioned emulators can be installed for free, AMIDuOS comes at a price of $10.

Domain name

IntelTechniques.com OSINT Workflow Chart: Domain Name

If the person or organization owns a website, you have to know how to gather information about it. Investigating it might reveal the operating system being used, software versions, personal contact info and more. I have to mention that it is advisable to investigate without ever ‘touching’ the target’s environment. This technique is called passive reconnaissance: footprinting that involves the use of tools and resources that can help you obtain more information about your target without directly interacting with it. Below I describe methods of obtaining information while remaining stealthy.

Google Dorks

  • site:example.com — limits search to a particular website or domain.
  • filetype:DOC — returns DOC files or other specified types, such as PDF, XLS and INI. Multiple file types can be searched for simultaneously by separating extensions with “|”.
  • intext:word1 — searches for pages that contain the specified word.
  • allintext: word1 word2 word3 — searches for pages that contain all of the given words.
  • related:example.com — will list web pages that are “similar” to a specified web page.
  • site:*.example.com — show all subdomains. Asterisk acts as a substitute for a whole word or words in search queries.
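The operators above can be combined into a single query. The following Python sketch shows one way to do that, including the “|” separator for multiple file types; the function names `any_filetype` and `domain_dork` are hypothetical, and example.com is a placeholder domain.

```python
def any_filetype(extensions):
    """Join several filetype: operators with '|' so that any of them may match."""
    return " | ".join(f"filetype:{ext}" for ext in extensions)


def domain_dork(domain, extensions=(), words=(), subdomains=False):
    """Build a query limited to one domain, optionally covering its subdomains."""
    target = f"*.{domain}" if subdomains else domain
    parts = [f"site:{target}"]
    parts += [f"intext:{word}" for word in words]
    if extensions:
        parts.append(any_filetype(extensions))
    return " ".join(parts)


if __name__ == "__main__":
    print(domain_dork("example.com", extensions=["xls", "xlsx"], words=["email"]))
    print(domain_dork("example.com", subdomains=True))
```

Since the queries are sent to a search engine and not to the target itself, this stays within passive reconnaissance.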

Whois

Reverse Whois

Same IP

Passive DNS

Internet archives and cache

There are cases when deleted pages were not archived but are still cached by search engines. They can be found on cachedview.com, or you can request the cached version with the following Google query: cache:website.com. Didn’t find anything on Google? Check the caches of other search engines, but keep in mind that a cache shows the page as it was the last time it was indexed. Therefore, you might get a page with missing images and outdated information.

You may also like visualping.io — a monitoring service that takes screenshots of the webpage at the selected time and sends you an email alert if something changes.

Reputation, malware and referrals analysis

  • www.siteworthtraffic.com — analyses website traffic (users, page views) and estimates how much revenue it could generate through ads.
  • www.alexa.com — analyses website traffic and its competitors, shows what they are doing better and gives advice on SEO improvement.
  • www.similarweb.com — analytics tool which provides deep information on website or mobile ranking, performance, the source of traffic, and more. On top of that, it does referral analysis.
  • https://sitecheck.sucuri.net — scans websites for known malware, blacklisting status, website errors, and out-of-date software.
  • www.quttera.com — offers free malware scanning and provides a comprehensive report that includes malicious files, suspicious files, blacklisted status and more.
  • www.urlvoid.com — helps you detect potentially malicious websites. Also, it gives more information about the domain (IP address, DNS records, etc.) and cross-references it against known blacklists.

IoT search engines


Location search

IntelTechniques.com OSINT Workflow Chart: Location

Geolocation tools

IP-based Geolocation

Useful websites

  • http://snradar.azurewebsites.net — search for geotagged public posts on VKontakte and filter them by date.
  • http://photo-map.ru — allows you to search geotagged VKontakte posts, like the previous service, but requires authorization.
  • www.earthcam.com — the global network of owned and operated live streaming webcams which might be useful during location research.
  • www.insecam.org — a directory of online security cameras. The coordinates of the cameras are approximate and point to the ISP address and not the physical address of the camera.

Images

An image itself contains a lot of useful information, such as camera details and geocoordinates. This is called EXIF data, and if it wasn’t removed you might find a lot of interesting info. For example, you can map the geocoordinates to find out where the picture was taken, or get the camera serial number and check whether other pictures taken with that camera are on the internet; there is a special service for that, stolencamerafinder.com. Image editing tools allow you to view metadata, and if you don’t want to install a complex program, Exiftool, a free cross-platform tool, might be the thing you are looking for. A third option is to view EXIF data online: exifdata.com or viewexifdata.com. To remove EXIF data you can use a locally installed tool, exifpurge.com, or do it online: verexif.com.
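EXIF stores GPS coordinates as degrees, minutes and seconds plus a hemisphere reference (N, S, E or W), while most maps expect decimal degrees. The Python sketch below shows that conversion. It assumes the values have already been read from the file with a tool of your choice and are given as (numerator, denominator) pairs; the function name and the sample coordinates are made up for illustration.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert EXIF-style degrees/minutes/seconds rationals to decimal degrees."""
    degrees, minutes, seconds = (Fraction(num, den) for num, den in dms)
    value = degrees + minutes / 60 + seconds / 3600
    if ref in ("S", "W"):
        value = -value  # southern and western hemispheres are negative
    return float(value)


if __name__ == "__main__":
    # Hypothetical values in the shape of the GPSLatitude and GPSLongitude tags.
    latitude = dms_to_decimal(((40, 1), (26, 1), (4600, 100)), "N")
    longitude = dms_to_decimal(((79, 1), (58, 1), (5600, 100)), "W")
    print(f"{latitude:.6f}, {longitude:.6f}")
```

The resulting pair can be pasted into any map service that accepts “latitude, longitude” input.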

Do you need to perform image forensics and find out if the image was tampered with? Use the Forensically or FotoForensics online tools. If you don’t want to upload an image online, Phoenix or Ghiro can be run locally. The latter is more automated and gives you more functionality than the online tools mentioned above. Apart from that, when working with an image you might need to deblur it or improve its quality, so here are some enhancement tools:

  • Smartdeblur — restores motion blur and removes Gaussian blur. Helps to restore focus and do image improvements which deliver amazing results.
  • Blurity — focuses only on deblurring images, doesn’t provide as many options as the previous tool and is available only on Mac.
  • Letsenhance.io — enhance and upscale images online using AI.

SOCMINT

Facebook

  • Facebook Sleep Stats — estimates sleeping patterns based on users’ online/offline status.
  • lookup-id.com — helps you to find the Facebook ID for a profile or a group.

Twitter

  • TweetDeck — gives you a dashboard that displays separate columns of activity from your Twitter accounts. For example, you might see separate columns for your home feed, your notifications, your direct messages, and your activity — all in one place on the screen.
  • Trendsmap — shows you the most popular trends, hashtags, and keywords on Twitter from anywhere around the world.
  • Foller — gives you rich insights about any public Twitter profile (profile public information, number of tweets and followers, topics, hashtags, mentions).
  • Socialbearing — free Twitter analytics & search for tweets, timelines & twitter maps. Finds, filters, and sorts tweets or people by engagement, influence, location, sentiment and more.
  • Sleepingtime — shows the sleeping schedule of Twitter public accounts.
  • Tinfoleak — shows devices, operating systems, applications and social networks used by the Twitter user. Also, it shows places and geolocation coordinates to generate a tracking map of locations visited. Maps user tweets in Google Earth and more.

Instagram

LinkedIn

  • LinkedInt — scrapes e-mail addresses of employees in a selected organization. Supports automated e-mail prefix detection for a given company domain name.
  • ScrapedIn — a Python script that scrapes profile data and imports it into an XLSX file (intended to be used with Google Spreadsheets).

Automating OSINT

SpiderFoot

theHarvester

Recon-ng

Maltego

FOCA

Metagoofil


Conclusion

While this article was more about intelligence gathering, the next one will be about the preparation phase. Let me know in the comments if there is something specific you want to know about preparing an investigative environment. By the way, what tools and techniques do you use to gather intelligence?