
Scraping Google Maps Using Selenium and Python

Rohan Goel
Dec 28, 2020 · 6 min read

Selenium is a free, open-source automated testing framework used to validate web applications across different browsers and platforms, but, like Scrapy and Beautiful Soup, it is also used as a web-scraping tool.

In this article, we will be looking at how we can use Selenium to scrape all the Location Data from Google Maps using the URL of the location.

So without further ado, let’s get started…

Quick Note: Any data collected from websites may be subject to copyright, which means it should not be reused without the owner’s consent and should definitely not be used for commercial purposes. The main objective of this article is to show how to collect data as a coding exercise, and as a way to build datasets for research and/or personal projects.

Items Scraped

1. Location Data

  • Average Rating
  • Total Reviews
  • Address of Location
  • Phone Number
  • Website URL
  • Open and Close Time for each Day
  • Busy Percentage for each hour of the Day

2. Reviews Data

  • Reviewers Name
  • Reviewed Date
  • Reviewed Text
  • Reviewed Rating

Let’s Get Started

Step 0: Bird’s Eye View

  • Reach the location’s page on Maps and get the location data,
  • Get the open and close times for each day,
  • Get the busy-percentage data for each hour of each day,
  • Click the “More Reviews” button to go to the location’s all-reviews page,
  • Scroll the page to load all the reviews: the page is implemented using AJAX (Asynchronous JavaScript and XML), which means reviews are only loaded into the page as we scroll down to look for them.
  • Click the “More” button on long reviews to load them completely.
  • Finally, scrape the review details from the page.

Step 1: Installation

$ pip install selenium
  • Now download the Google Chrome WebDriver (ChromeDriver), a separate executable that Selenium uses to launch and control the browser instance.
  • Note: download the same version as your Chrome browser.
  • Add the downloaded executable to your current working folder.
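Since the original setup code was a screenshot, here is a minimal sketch of launching Chrome through the downloaded driver using the Selenium 4 `Service` API; the driver file name and the headless flag are assumptions, so adjust them for your OS and setup.

```python
import os


def make_driver(driver_name="chromedriver.exe", headless=False):
    """Launch Chrome via the ChromeDriver executable in the working folder."""
    # Imports are local so this helper can be defined without a browser around.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    if headless:
        options.add_argument("--headless=new")  # run without a visible window
    service = Service(os.path.join(os.getcwd(), driver_name))
    return webdriver.Chrome(service=service, options=options)
```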

Step 2: Import Essential Libraries

Step 3: Create a main class and initialize

  • The __init__ function is the constructor; it is called automatically when the class is instantiated and initializes the necessary parameters.
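The original class definition was an image, so this is only a hedged sketch of what the constructor could look like; the class name, the field names, and taking the driver as a parameter (instead of creating it inside __init__) are all assumptions.

```python
class GoogleMapsScraper:
    """Sketch of a scraper for a single Google Maps location page."""

    def __init__(self, url, driver=None):
        self.url = url          # the Google Maps location URL to scrape
        self.driver = driver    # a Selenium WebDriver instance
        self.location_data = {  # filled in by the later steps
            "rating": None,
            "reviews_count": None,
            "address": None,
            "phone": None,
            "website": None,
            "open_hours": {},
            "popular_times": {},
            "reviews": [],
        }

    def open_page(self):
        """Navigate the browser to the location page."""
        self.driver.get(self.url)
```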

Step 4: Get Location Data

  • The self.driver.find_element_by_… calls are Selenium functions that find the HTML elements with the given class name or id and store them in variables; reading the .text attribute of those variables then gives us the respective values. (Newer Selenium releases replace these helpers with driver.find_element(By.…, …).)
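A sketch of how this lookup might be written. Every CSS class and data-item-id below is an assumption (Google Maps obfuscates and rotates its class names), so inspect the live page in DevTools and adjust the selectors.

```python
def get_location_data(driver):
    """Read the headline fields from a loaded Google Maps location page."""

    def text_of(css):
        try:
            return driver.find_element("css selector", css).text
        except Exception:
            return None  # the field is simply missing for this location

    return {
        "rating": text_of("span.section-star-display"),
        "reviews_count": text_of("button.section-rating-term"),
        "address": text_of("button[data-item-id='address']"),
        "phone": text_of("button[data-item-id*='phone']"),
        "website": text_of("button[data-item-id='authority']"),
    }
```

Because the function only duck-types the driver, it can be exercised with a fake object before pointing it at a real browser.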

Step 5: Get Open & Close Times

  • The class “lo7U087hsMA__row-header” contains all the days, and “lo7U087hsMA__row-interval” contains the respective open and close times.
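Using the two class names quoted above, the step can be sketched as pairing each day header with its interval; note these are obfuscated Maps classes and may have changed since publication.

```python
def get_open_hours(driver):
    """Pair each day with its open/close interval, as shown on the page."""
    days = driver.find_elements("class name", "lo7U087hsMA__row-header")
    intervals = driver.find_elements("class name", "lo7U087hsMA__row-interval")
    return {day.text: interval.text for day, interval in zip(days, intervals)}
```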

Step 6: Get Busy Percentage for each day

  • The variable “a” is a list of all the days. We loop through “a” and find all the hour slots available in that day, storing them in list “b”; we then loop through “b”, find the busy percentage for each hour, and store it in our final data list.
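The nested loop described above could look like the sketch below. Maps exposes each bar's busy level through its aria-label (e.g. “67% busy at 1 PM.”); both selectors and the exact label format are assumptions to verify in DevTools.

```python
import re


def get_popular_times(driver):
    """Collect {day_index: [{hour, busy_percent}, ...]} from the busy graph."""
    popular = {}
    # One column of bars per day of the week (selector is an assumption).
    day_columns = driver.find_elements("css selector", "div.section-popular-times-graph")
    for day_index, column in enumerate(day_columns):
        bars = column.find_elements("css selector", "div.section-popular-times-bar")
        hours = []
        for bar in bars:
            label = bar.get_attribute("aria-label") or ""
            match = re.search(r"(\d+)%\s*busy\s*at\s*(.+?)\.?$", label)
            if match:
                hours.append({"hour": match.group(2),
                              "busy_percent": int(match.group(1))})
        popular[day_index] = hours
    return popular
```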

Step 7: Click the all reviews button

  • To do that, we find the “All reviews” button in the HTML and use Selenium’s .click() function to click it and get redirected to that page.
  • The Selenium WebDriverWait function tells Selenium to wait until that element has been loaded into the HTML before proceeding.
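A sketch of the wait-then-click step; the button selector is an assumption, and WebDriverWait polls until the element is clickable rather than failing immediately.

```python
def click_all_reviews(driver, timeout=10):
    """Wait for the 'All reviews' button to become clickable, then click it."""
    # Imports are local so the helper can be defined without a browser.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    button = WebDriverWait(driver, timeout).until(
        EC.element_to_be_clickable((By.CSS_SELECTOR, "button[jsaction*='moreReviews']"))
    )
    button.click()
```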

Step 8: Load all reviews

  • Let’s create a scroll-page function that first scrolls and loads all the reviews before we proceed to scrape them.
  • The above code scrolls the page 5 times: each pass brings the scroll bar to the bottom, waits for the new reviews to load, and then scrolls to the bottom again.
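The scroll loop can be sketched as below. Scrolling the window (rather than the inner reviews pane, which the real page may require) and the 5-pass/3-second defaults are assumptions.

```python
import time


def scroll_page(driver, times=5, pause=3):
    """Scroll to the bottom repeatedly so the AJAX review batches load."""
    for _ in range(times):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the next batch of reviews time to load
```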

Step 9: Expand long reviews

  • So let’s create an expand-all-reviews function that finds all these “More” buttons on the already loaded page and clicks them to load the full review text.
  • The element variable is the list of all those buttons present on the loaded page.
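A sketch of that helper; the button class name is an assumption, and each click is wrapped in try/except because buttons can go stale after the page re-renders.

```python
def expand_all_reviews(driver):
    """Click every 'More' button so long reviews are fully expanded."""
    buttons = driver.find_elements("css selector", "button.section-expand-review")
    for button in buttons:
        try:
            button.click()
        except Exception:
            pass  # stale or already-expanded button; skip it
    return len(buttons)
```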

Step 10: Scrape Reviews Data

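The scraping code for this step was also an image; here is a hedged sketch that reads the four review fields listed at the top of the article. Every selector is an assumption and should be checked against the live page.

```python
def get_reviews(driver):
    """Collect name, date, text, and rating from each loaded review card."""
    reviews = []
    for card in driver.find_elements("css selector", "div.section-review"):

        def text_of(css, card=card):
            try:
                return card.find_element("css selector", css).text
            except Exception:
                return None  # e.g. a rating-only review with no text

        reviews.append({
            "name": text_of("div.section-review-title"),
            "date": text_of("span.section-review-publish-date"),
            "text": text_of("span.section-review-text"),
            "rating": text_of("span.section-review-stars"),
        })
    return reviews
```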

Final Step


Full Code

Conclusion

  • And with this, we have written a complete Python + Selenium script that, given just a location’s URL, can scrape all of its data from Google Maps.
  • Feel free to ask your doubts and queries regarding this article in the comments section.
  • Connect with me on LinkedIn.

The Startup
