Scrape Google search results with Node JS

Darshan Khandelwal
4 min read · Jul 5, 2022


In this post, we will learn to scrape Google search results using Node JS.

Requirements:

To scrape Google Search Results, we will use the following tools in this tutorial:

  1. Node JS
  2. Unirest JS — To extract the HTML data of the target URL.
  3. Cheerio JS — To parse the extracted HTML data.

Before starting, make sure you have set up your Node JS project and installed the npm packages — Unirest JS and Cheerio JS.
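Both packages can be installed from the npm registry with a single command in your project folder:

npm i unirest cheerio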

Target:

We will be scraping the organic search results for the query “javascript” on google.com, with the gl=us and hl=en parameters set to get US English results.

Process:

Now that everything is set up, we can prepare our scraper. We will make a GET request to the target URL with the npm library Unirest JS to fetch the raw HTML data, and then parse that HTML with Cheerio JS.

Then we will search for the HTML tags that contain the respective titles, links, snippets, and displayed links.

Inspecting the search page in the browser, the selector for the title is .g .yuRUbf h3, for the link it is .yuRUbf a, for the snippet it is .g .VwiC3b, and for the displayed link it is .g .yuRUbf .NJjxre .tjvcx.

Here is our code:

const unirest = require("unirest");
const cheerio = require("cheerio");

const getOrganicData = () => {
  return unirest
    .get("https://www.google.com/search?q=javascript&gl=us&hl=en")
    .headers({
      // A browser-like User-Agent so Google serves the normal HTML page
      "User-Agent":
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36",
    })
    .then((response) => {
      // Load the raw HTML into Cheerio for parsing
      let $ = cheerio.load(response.body);
      console.log(response.status);

      let titles = [];
      let links = [];
      let snippets = [];
      let displayedLinks = [];

      // Extract each field using the selectors identified above
      $(".g .yuRUbf h3").each((i, el) => {
        titles[i] = $(el).text();
      });
      $(".yuRUbf a").each((i, el) => {
        links[i] = $(el).attr("href");
      });
      $(".g .VwiC3b").each((i, el) => {
        snippets[i] = $(el).text();
      });
      $(".g .yuRUbf .NJjxre .tjvcx").each((i, el) => {
        displayedLinks[i] = $(el).text();
      });

      // Zip the parallel arrays into one array of result objects
      const organicResults = [];
      for (let i = 0; i < titles.length; i++) {
        organicResults[i] = {
          title: titles[i],
          link: links[i],
          snippet: snippets[i],
          displayedLink: displayedLinks[i],
        };
      }
      console.log(organicResults);
    });
};

getOrganicData();

However, a single User Agent might not be enough for scraping Google, as Google can block your IP address from making further requests. So, we need a pool of User Agents, which we will rotate on every request.
Here is how we can do this:

const selectRandom = () => {
  const userAgents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.157 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36",
  ];
  // Pick a random index into the User Agent pool
  const randomNumber = Math.floor(Math.random() * userAgents.length);
  return userAgents[randomNumber];
};

let user_agent = selectRandom();
let header = {
  "User-Agent": `${user_agent}`,
};

Pass this header variable to the .headers() method of Unirest, and you will be able to rotate the User Agent on each request.
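Note that selectRandom() has to run inside the request function so that a fresh User Agent is picked for every call. Here is a minimal sketch of getOrganicData with the rotation wired in (the parsing logic stays exactly the same as before):

const getOrganicData = () => {
  // Pick a fresh User Agent for this particular request
  const header = { "User-Agent": selectRandom() };

  return unirest
    .get("https://www.google.com/search?q=javascript&gl=us&hl=en")
    .headers(header)
    .then((response) => {
      // ...parse response.body with Cheerio as before
    });
};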

Note: You can also use a proxy server while making a get request with Unirest JS. For example:

return unirest
  .get("https://www.google.com/search?q=javascript&gl=us&hl=en")
  .headers(header)
  .proxy("your proxy");

Here, “your proxy” refers to the URL of the proxy server you will route requests through. The proxy server hides your IP address, which means Google cannot identify it while the request is being made, saving your IP from being blocked.
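For example, with a typical authenticated HTTP proxy, the call might look like the sketch below (the host, port, and credentials are placeholders, not a real proxy):

return unirest
  .get("https://www.google.com/search?q=javascript&gl=us&hl=en")
  .headers(header)
  // Hypothetical proxy URL of the form protocol://user:pass@host:port
  .proxy("http://user:pass@12.34.56.78:4000");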

Results:

Running the scraper logs the response status, followed by an array of organic result objects, each containing a title, link, snippet, and displayedLink.

With Google Search Results API:

Serpdog | Google Search API supports all major SERP features, such as knowledge graphs, answer boxes, and top stories. Serpdog also offers its users 100 free requests on their first sign-up.

Scraping can sometimes be time-consuming, but pre-cooked structured JSON data can save you that time.

const axios = require("axios");

axios
  .get("https://api.serpdog.io/search?api_key=APIKEY&q=javascript&gl=us")
  .then((response) => {
    console.log(response.data);
  })
  .catch((error) => {
    console.log(error);
  });

Results:

The API responds with the same organic results as ready-made structured JSON.

Additional Resources:

  1. Scrape Google Maps Reviews
  2. How to scrape Google Images Results?
  3. Scrape Google News Results
  4. Scrape Google Autocomplete Suggestions
  5. Web Scraping Google With Node JS — A Complete Guide

Conclusion:

In this tutorial, we learned how to scrape Google Search Results with Node JS. Feel free to ask me anything in the comments👇🏻. Thanks for reading!
