Google’s Advanced Search Operators: filetype & site vs. source

Justin O'Hara
Feb 13 · 3 min read

An exploration of the advanced search operator filetype, and of the difference between site and source, within Google’s search engine.

Google’s search operator filetype: and site: being fully exported and highlighted on SerpApi

Following up on my last two articles on advanced search operators, AROUND(X) and intext vs. allintext & inurl vs. allinurl, I decided to explore the effectiveness of filetype, site, and source.

Side by side comparison of Google Search results for “caffeine filetype:txt” and “caffeine ext:txt”

In my initial research on filetype I saw that you can swap the search operator “filetype” for “ext”. As you can see, the top results are nearly identical, but there is a “People also search for” block of text right after the first organic_result. In the following experiments I will stick to “filetype” for continuity’s sake. Google is very dynamic and results can change at any moment. It is also important to note that, while very similar, filetype and ext can produce a different result order.
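To make the comparison repeatable, the two queries can be generated side by side. This is only a minimal sketch: the helper names `build_query` and `google_search_url` are hypothetical, and it assumes the standard `https://www.google.com/search?q=...` URL form. It builds the URLs and does not fetch any results.

```python
from urllib.parse import urlencode


def build_query(term, operator, value):
    """Return a query string such as 'caffeine filetype:txt'."""
    return f"{term} {operator}:{value}"


def google_search_url(query):
    """URL-encode the query into a Google search URL (assumed URL form)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Build the same search with both operators so the results can be compared by hand.
for operator in ("filetype", "ext"):
    query = build_query("caffeine", operator, "txt")
    print(query, "->", google_search_url(query))
```

Opening both printed URLs in separate tabs reproduces the side by side comparison shown above.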

The first result for filetype:pdf was accessible:

Side by Side Google “caffeine filetype:pdf” and the link for that particular `organic_result`

While the first result for filetype:docx was not:

Side by Side Google “caffeine filetype:docx” and the link for that particular `organic_result`

Using filetype opened my eyes to how Google results can be manipulated to retrieve specific kinds of files. Much like switching to the Images tab on Google, it turns the search page into a portal for any file type that could be useful to your search, provided there aren’t security issues with opening the results.

The next thing I wanted to explore was the difference between two more advanced search operators, site and source.

Side by Side Google results for “caffeine source:nytimes” and “caffeine site:sprudge.com”

So I switched the advanced operators with the site and source that they were pulling from:

Side by Side Google results for “caffeine source:sprudge” and “caffeine site:nytimes.com”

One thing I noticed when getting these results and swapping the operators was that “site” requires a top level domain:

Google results for “caffeine site:nytimes”

The site advanced operator is fairly strict about syntax before it returns results, while source is looser. You can leave the top level domain on source, or drop it, and receive the same results.
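The observation above can be captured in a small query builder. This is a rough sketch based only on the behavior seen in these experiments, not on documented Google rules: the function names are hypothetical, and the top level domain check is a simple heuristic (a dot followed by letters).

```python
def has_top_level_domain(value):
    """Heuristic check: text, a dot, then letters, e.g. 'nytimes.com'."""
    name, dot, tld = value.rpartition(".")
    return bool(name) and bool(dot) and tld.isalpha()


def site_query(term, domain):
    """Build a site: query, rejecting values without a top level domain."""
    if not has_top_level_domain(domain):
        raise ValueError(f"site: expects a top level domain, got {domain!r}")
    return f"{term} site:{domain}"


def source_query(term, source):
    """Build a source: query; the top level domain is optional here."""
    return f"{term} source:{source}"


print(site_query("caffeine", "nytimes.com"))
print(source_query("caffeine", "sprudge"))
print(source_query("caffeine", "sprudge.com"))
```

Calling `site_query("caffeine", "nytimes")` raises a `ValueError`, mirroring the search above where “site:nytimes” needed its top level domain.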

Side by Side Google results for “caffeine source:sprudge” and “caffeine source:sprudge.com”

Experimenting with the filetype and ext search operators definitely opened my eyes to the results you could potentially retrieve. It is as though a search engine was placed on top of Google Search to fetch PDF, DOCX, TXT, and other files.

The difference between source and site led me to the conclusion that, while both have their purpose, I would be more willing to use “source” over “site” to open up results and not be contingent on top level domains.

May the filetype, ext, source, and site advanced search operators continue to sharpen your googling skills and provide you with a myriad of different results.

You can sign up for SerpApi here: https://serpapi.com/

You can find the SerpApi user forum here: https://forum.serpapi.com/

You can find the API documentation here: https://serpapi.com/search-api/
