How to scrape Google results into a spreadsheet: SerpApi Google Sheets plugin

Elizabeth Oster
10 min readJun 30, 2020

--

Getting Started:
SerpApi is for getting data from pages like on Google’s search engine. That data is pulled from the webpage and stored into a file which the user can then parse for the specific bits of information they were looking for. Generally, you’d need to know how to code to write an implementation of SerpApi to get the data from the Google results pages. SerpApi now has a plugin for Google Sheets that allows you to do all of this without having to know how to code!

You will also need a SerpApi account and private API key to utilize the SerpApi Google Sheets plugin. You can sign-up for an account and get your API key here: https://serpapi.com.

The Basics:
Before any data can be pulled off of any webpage, we first need to know how to make the page to pull data from. This is done through search parameters that are in the URL link for the search page.

Take this URL for example:

https://www.google.com/search?q=Coffee

It will be a Google results page for the word Coffee. Looking back at the URL, notice the search term in the URL as Coffee and it is set equal to “q” like this:

q=Coffee

This is the search parameter for the keyword and it’s how Google knows to give us the page for Coffee. And if we are wanting to search for Cake, we can replace Coffee in the URL with Cake and it should look something like this:

https://www.google.com/search?q=Cake

And it does work as intended, we get a page for the keyword Cake because that is what we put as the keyword parameter, “q”.

Google searches are made up of lots of parameters like “q” which specify things like language, location, and a variety of other ways to tailor a search. Google searches are comprised of more than just the keyword and all of these parameters can affect the results you get. If you do a Google search yourself, you’ll notice that the URL is much longer than that one we were just working on. Take this search for Cake for example:

https://www.google.com/search?ei=wU-PXuusK5C2tQXBzI2gBA&q=Cake&oq=Cake&gs_lcp=CgZwc3ktYWIQAzIECAAQQzIGCAAQChBDMgcIABCDARBDMgQIABBDMgQIABBDMgQIABBDMgQIABBDMgUIABCDATIECAAQQzICCAA6BAgAEEc6BwgAEEMQiwM6CAgAEIMBEIsDShQIFxIQMGcxMzFnMTYyZzQ2ZzExNEoNCBgSCTBnMWcxZzNnMVCWwwFYvcsBYITNAWgAcAJ4AIABkQGIAbUDkgEDMy4xmAEAoAEBqgEHZ3dzLXdpergBAg&sclient=psy-ab&ved=0ahUKEwirkfaV5NvoAhUQW60KHUFmA0QQ4dUDCAs&uact=5

You can see there is a lot more here than just that “q” parameter. Most of these added parameters are for things like tracking on Google’s part and aren’t relevant to the results on the website and can hence aren’t important for our API.

You can find all of the other parameters in this URL by looking for the “&” symbol. This symbol is used to mark where one parameter ends and the next begins. The keyword parameter was “q” and we know it is just “q” because that was everything before the “=” which denotes what the parameter is being set to. So the parameter itself will be between the “&” and the “=” and everything after the “=” and before the next “&” is what the parameter is set to.

In this URL you can find a parameter called “ved” and it is set to “0ahUKEwirkfaV5NvoAhUQW60KHUFmA0QQ4dUDCAs”.

Not all parameters will be clear as to their purpose like the “q” parameter. In the case of “ved”, this is for tracking and isn’t useful for our API.

One parameter that is useful for our API is a parameter called “num”. Normally, a Google search will include 10 results on a page. But with the “num” parameter, we can tell Google to serve more than 10 results per page up to 100. Let’s use this with this URL from earlier:

https://www.google.com/search?q=Cake

Now we can add the “num” parameter onto it with an “&” and set it to “100” with an “=”:

https://www.google.com/search?q=Cake&num=100

Now if you scroll through this page, you’ll see that you now have 100 results on a single page!

All parameters can be added in the same way. A good place to find new parameters and test with them is on our playground:

https://serpapi.com/playground

Here you can see what each parameter is called and see what they do. All of these parameters can be used to create the searches you are looking for!

Using the SerpApi Google Sheets Plugin:

The SerpApi Google Sheets Plugin adds two new custom functions and a variety of menu options. There are 2 main ways to utilize the SerpApi Google Sheets Plugin and they are based off of what you are wanting to use the API for. Each of these ways has their own function.

The first is for search rankings. This is to find where a particular website ranks for a search. This is a common use for SEO.

The second is for specific information on the page. This can be the titles and links of the results to the total number of results reported and much more. If it’s directly on the page and you want it, this is the method for you.

1) Search Rankings:

The custom function for search rankings is:

=SERPAPI_RANK(“Query”,”Domain”)

This function takes two parameters, Query and Domain. You can think of Query as the parameters that make the search page and Domain as the website you are wanting to get a rank for.

The Query is composed of the search parameters for making the desired search page. This is identical to what was covered earlier in The Basics.

Let’s start with the search we left off with in The Basics:

https://www.google.com/search?q=Cake&num=100

We are using two search parameters here, “q”and “num”. These are what we need so we can get rid of the rest:

q=Cake&num=100

Now that we have our search parameters, we need to specify which search engine we are wanting to conduct the search on. In this example, we are using Google. We can specify that by adding it as if it were a search parameter called “engine”:

q=Cake&num=100&engine=google

Once we have our search parameters and the search engine, the Query is finished and we can move onto the Domain.

The first result on that page is for a Wikipedia page. Let’s use “en.wikipedia.org” for the example to see if everything is working correctly.

Now that we have the Query and Domain, we can put them into the function like so:

=SERPAPI_RANK(“q=Cake&num=100&engine=google”,”en.wikipedia.org”)

You’ll notice that both the Query and Domain are put in quotes and separated with a comma. This is necessary for the function to work correctly.

Now if we run this function in our Google Sheet with the SerpAPI Plugin, we should get back “1” because it’s the first results on the page:

And now a “1” is returned into the cell because en.wikipedia.org is the first result on the page. If the Domain isn’t on that page at all, a “-” will be returned to the cell to indicate the result wasn’t there.

To get the most out of the SERPAPI_RANK function, you can use the menu options to automate getting your ranks.

Navigate to “Add-ons” then “SerpApi” then “Google” and finally select the menu option“Add result formula”

Once we have our search parameters and the search engine, the Query is finished and we can move onto the Domain.

The first result on that page is for a Wikipedia page. Let’s use “en.wikipedia.org” for the example to see if everything is working correctly.

Now that we have the Query and Domain, we can put them into the function like so:

- =SERPAPI_RANK(“q=Cake&num=100&engine=google”,”en.wikipedia.org”)

You’ll notice that both the Query and Domain are put in quotes and separated with a comma. This is necessary for the function to work correctly.

Now if we run this function in our Google Sheet with the SerpAPI Plugin, we should get back “1” because it’s the first results on the page:

And now a “1” is returned into the cell because en.wikipedia.org is the first result on the page. If the Domain isn’t on that page at all, a “-” will be returned to the cell to indicate the result wasn’t there.

To get the most out of the SERPAPI_RANK function, you can use the menu options to automate getting your ranks.

Navigate to “Add-ons” then “SerpApi” then “Google” and finally select the menu option“Add result formula”

Once selected, you’re sheet will look like this:

The ranks might be different when you are setting things up but we can see both some pages where “starbucks.com” was ranked on and some where it wasn’t.

We also see that column A is for the Query and column B is for the Domain while column C contains the ranks. The function is actually only being run in column C using columns A and B to automate the process while keeping both the Query and Domain visible.

The functions than ran in C looked like this:

=SERPAPI_RANK(A2, B2)
=SERPAPI_RANK(A3, B3)
=SERPAPI_RANK(A4, B4)
=SERPAPI_RANK(A5, B5)
=SERPAPI_RANK(A6, B6)

Using the SERPAPI_RANK function like this is a great way so set your parameters once instead of every time for every rank.

Now if we navigate back to the SerpApi/Google menu, we can select “Add today’s rank”:

Once selected, our sheet will look like this:

Now we have a new column with new ranks titled with today’s date.

Utilizing these menu options can help automate this process so you can come back and easily get the new ranks.

2) Search Result Page Data:

The custom function for getting data off of the search result page is:

=SERPAPI_RESULT(“Query”,”Selector”)

This function takes two parameters, Query and Selector. You can think of Query as the parameters that make the search page and Selector as the location in the JSON file where to find the bit of data you are looking for.

The Query is composed of the search parameters for making the desired search page. This is identical to what was covered earlier in The Basics.

Let’s start with the search we left off with in The Basics:

- https://www.google.com/search?q=Cake&num=100

We are using two search parameters here, “q”and “num”. These are what we need so we can get rid of the rest:

q=Cake&num=100

Now that we have our search parameters, we need to specify which search engine we are wanting to conduct the search on. In this example, we are using Google. We can specify that by adding it as if it were a search parameter called “engine”:

q=Cake&num=100&engine=google

Once we have our search parameters and the search engine, the Query is finished and we can move onto the Selector.

The Selector refers to where in the JSON file the bit of data is. So first we need to understand a bit about these JSON files.

SerpApi returns all of the data from the search page in a JSON file. JSON files can store data in an organized way so that specific parts can be pulled easily.

Here is part of a JSON file for a search done with SerpApi:

“search_metadata”:

{“id”: “5cc470041befcd91e24b126a”,“status”: “Success”,“created_at”: “2019–04–27 15:06:44 UTC”,“processed_at”: “2019–04–27 15:06:44 UTC”,“google_url”: “https://www.google.com/search?q=Coffee&oq=Coffee&uule=w+CAIQICIaQXVzdGluLFRleGFzLFVuaXRlZCBTdGF0ZXM&hl=en&gl=us&sourceid=chrome&ie=UTF-8",“total_time_taken”: 2.59}

You can see at the top we have “search_metadata” and below it are some lines with text that are indented. This indentation means that these lines are under “search_metadata”.

For this example, the bit data we are looking to get is the “id” for this search which is “5cc470041befcd91e24b126a”. We know this because “5cc470041befcd91e24b126a” is inline with “id” and after the colon.

We know that the line with “id” is indented under the line with “search_metadata”. This means that “id” is found within “search_metadata”. So if we are wanting to get to “id”, we first must go through “search_metadata”.

If we think of this in terms of levels, it would be something like this:

  • Level 1: search_metadata
  • Level 2: id

JSON files cannot skip levels so when creating the Selector for the SERPAPI_RESULT function. We need to know all of the levels it takes to get to the bit of data we want.

Since we know all of the levels to get to “id”, we can start assembling the Selector. We start from level 1 and end with the level for our bit of data while separating each with a “.”:

  • search_metadata.id

The “id” is only 2 levels deep so we only have two levels split with a single “.”

Now that we have both our Query and our Selector we can put them into the function like so:

=SERPAPI_RESULT(“q=Cake&num=100&engine=google”,”search_metadata.id”)

You’ll notice that both the Query and Selector are put in quotes and separated with a comma. This is necessary for the function to work correctly.

Now if we run the function, we should get the “id” for that search:

And we correctly get the id. (Yours will be different as each search ID is unique)

Just like for search parameters, the best place to find the JSON selector you are looking for is on the playground: https://serpapi.com/playground

On the bottom right you can see a preview of the JSON for that search. You can use this to find the selector for the bit of data you need as well as the levels above it so you can complete the function.

Wrapping up:

We made this Google sheets plugin after regularly having people come to us wanting to be able to use our API but didn’t have any coding knowledge themselves. This is generally a hard gap to bridge and we hope the Google sheets plugin is able to do this for you.

If you experience any issues or need help getting things set up, feel free to email us at contact@serpapi.com and we’ll help get things sorted for you.

--

--