Google Hacking

shaistha fathima
Jun 28, 2019

Google Search Engine Hacking using the advanced search techniques.

Last Links Update 21/8/2021

Google hacking, also known as Google dorking, is an information-gathering technique that leverages Google's advanced search operators. Used efficiently, it can identify security vulnerabilities in web applications, gather information on arbitrary or individual targets, and discover error messages that disclose sensitive information or files that contain credentials and other sensitive data.

This is done by including advanced search operators in the search query, which refines the query toward the desired results. For example:

Syntax -> operator:search_term

site:wikipedia.com intitle:"learning"

Note: There is no space between the advanced search operator and its search term; the two are separated only by the “:”.

The above query restricts results to the site, i.e., the domain name, wikipedia.com and to pages that have “learning” as part of their title.
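The operator:search_term syntax can be sketched in a few lines of Python. This is only an illustration: the helper names `build_query` and `search_url` are hypothetical, and the `https://www.google.com/search` endpoint with its `q` parameter is assumed rather than taken from this article.

```python
from urllib.parse import urlencode


def build_query(*terms):
    """Join search terms into one query string.

    Each term is either a plain keyword string or an (operator, value)
    pair, which is rendered as operator:value with no space around ":".
    """
    parts = []
    for term in terms:
        if isinstance(term, tuple):
            operator, value = term
            # A space next to the colon would break the operator syntax.
            if operator != operator.strip() or value != value.strip():
                raise ValueError("no spaces allowed around ':'")
            parts.append(f"{operator}:{value}")
        else:
            parts.append(term)
    return " ".join(parts)


def search_url(query):
    # Assumed endpoint: the public search page with a "q" parameter.
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_query(("site", "wikipedia.com"), ("intitle", '"learning"'))
print(query)
print(search_url(query))
```

Passing the quotes as part of the value leaves the caller in control of phrase matching, which keeps the helper itself small.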

Logical operators and symbols in Google Search

These operators and symbols help refine the search query. Each has a special meaning of its own, and combined with advanced search operators they can help build complex queries.

Advanced search operators

As mentioned above, advanced search operators refine the search query when included as part of it. The Black Hat presentation by Johnny Long, Google Hacking for Penetration Testers, is a good reference document.

List of the advanced operators and when to use them.

Some examples to demonstrate the use of Advanced Search Operators are as follows:

Example 1: Finding a specific page via Google search, building the query up from the requirements below.

Searching for the site (domain name) johnny.ihackstuff.com, with the filetype “php” in the URL.
Searching for the filetype “php”, with the title “i hack stuff”, the text “navigate”, and a number within the range “99999–100000”.
Combining all of the above queries to find that specific page via Google search.
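The three steps of Example 1 might be written out as follows. The exact operators are assumptions, since the original screenshots are not available: `filetype:`, `intitle:` and `intext:` are taken to stand for the filetype, title and text requirements, and the number range is written with the `..` range syntax. The `combine` helper is hypothetical.

```python
def combine(*steps):
    """Merge term lists in order, dropping terms that repeat."""
    seen = set()
    merged = []
    for step in steps:
        for term in step:
            if term not in seen:
                seen.add(term)
                merged.append(term)
    return " ".join(merged)


# Step 1: restrict the site and the filetype.
step1 = ["site:johnny.ihackstuff.com", "filetype:php"]

# Step 2: filetype, title, body text and a number range.
step2 = [
    "filetype:php",
    'intitle:"i hack stuff"',
    "intext:navigate",
    "99999..100000",
]

# Step 3: everything combined into one query.
combined = combine(step1, step2)
print(combined)
```

Each added operator narrows the result set further, so the combined query is the most specific of the three.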

Example 2: Another example showing the effectiveness of advanced search operators.

Knowing only the site, the keywords needed in the URL, and the filetype is enough to obtain targeted results.

For a better understanding, refer to Google Hacking for Penetration Testers, by far the best resource I have found on Google hacking, or to this video by JackkTutorials. (Update: JackkTutorials has since been blocked on YouTube!)

Other Useful Resources:

  1. Step-by-step guide: Google hacking to test your security
  2. Google Hacking Diggity Project
  3. Google Search Help
  4. Google Hacking Pentesting Tool
  5. Google Hacking Database
  6. Defend yourself from Google Hacker
  7. Google Hacking: How to save yourself from Google Dorking

So, why should you know about Google Hacking?

As stated in a blog by John Jolly (Note: the blog has since been updated and much of the previous content has been modified!):

When an attacker knows the sort of vulnerability he wants to exploit but has no specific target, he employs a scanner. A scanner is a program that automates the process of examining a massive quantity of systems for a security flaw. The earliest computer-related scanner, for example, was a war dialer: a program that would dial long lists of phone numbers and record which ones responded with a modem handshake.

Today there are scanners that automatically query IP addresses to see which ones can serve as a proxy for exploits. A proxy is an intermediary system that an attacker can use to disguise his or her identity. For example, if you were to gain remote access to XYZ’s computer and cause it to run attacks on treasury.gov, it would appear to the Feds that XYZ was hacking them. XYZ’s computer would be acting as a proxy. Google can be used in a similar way.

The search engine has already gathered this information and will give it freely without a peep to the vulnerable site. Things get even more interesting when you consider the Google cache function. If you have never used this feature, try this:

Do a Google search for “SearchTechTarget.com.” Click on the first result and read a few of the headlines. Now click back to return to your search. This time, click the “Cached” link to the right of the URL of the page you just visited. Notice anything unusual? You’re probably looking at the headlines from yesterday or the day before. Why, you ask? It’s because whenever Google indexes a page, it saves a copy of the entire thing to its server.

This can be used for a lot more than reading old news. The intruder can now use Google to scan for sensitive files without alerting potential targets — and even when a target is found, the intruder can access its files from the Google cache without ever making contact with the target’s server. The only server with any logs of the attack would be Google’s, and it’s unlikely they will realize an attack has taken place.

An even more elaborate trick involves crafting a special URL that would not normally be indexed by Google, perhaps one involving a buffer overflow or SQL injection. This URL is then submitted to Google as a new Web page. Google automatically accesses it, stores the resulting data in its searchable cache, and the rest is a recipe for disaster.

How can you prevent Google hacking?

Make sure you are comfortable with sharing everything in your public Web folder with the whole world, because Google will share it, whether you like it or not. Also, to prevent attackers from easily figuring out what server software you are running, change the default error messages and other identifiers. Often, when a “404 Not Found” error is detected, servers will return a page that says something like:

Not Found
The requested URL /cgi-bin/xxxxxx was not found on this server.
Apache/1.3.27 Server at your web site Port 80

The only information that the legitimate user really needs is a message that says “Page Not found.” Restricting the other information will prevent your page from turning up in an attacker’s search for a specific flavor of server.
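One way to check your own error pages for this kind of leak is a small pattern match, sketched below. The regular expression and the `find_server_banners` helper are illustrative assumptions: they only catch product/version tokens of the form name/x.y, not every possible identifier.

```python
import re

# Matches product/version tokens such as "Apache/1.3.27".
BANNER_RE = re.compile(r"\b[A-Za-z][\w-]*/\d+(?:\.\d+)+\b")


def find_server_banners(page_text):
    """Return every product/version token found in an error page."""
    return BANNER_RE.findall(page_text)


# The default page quoted above, and a trimmed-down replacement.
default_404 = """Not Found
The requested URL /cgi-bin/xxxxxx was not found on this server.
Apache/1.3.27 Server at your web site Port 80"""
custom_404 = "Page Not found."

print(find_server_banners(default_404))  # the default page leaks a banner
print(find_server_banners(custom_404))   # the custom page leaks nothing
```

Running a check like this against each error page your server can return gives a quick indication of whether a search for a specific server version could still match your site.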

Google periodically purges its cache, but until then your sensitive files are still being offered to the public. If you realize that the search engine has cached files that you do not want to be viewable, you can check this site and follow the instructions on how to remove your page, or parts of your page, from their database.

References:

https://www.acunetix.com/websitesecurity/google-hacking/

shaistha fathima

ML Privacy and Security Enthusiast | Research Scientist @openminedorg | Computer Vision | Twitter @shaistha24