Web Scraping Weather Data With Selenium Webdriver

Avoid being blocked by wunderground.com

Nov 18, 2020 · 5 min read


from bs4 import BeautifulSoup
import requests

# Fetch the hourly forecast page and parse the returned HTML
url = 'https://www.wunderground.com/hourly/us/ny/new-york-city/KNYNEWYO1335'
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')

If you cannot pay for the API, you have to make your requests look as if they come from a known browser user agent (e.g. Chrome).
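As a minimal sketch of what "looking like a known browser" can mean at the HTTP level, the snippet below attaches a Chrome-style User-Agent header to a request. It uses the standard library's urllib instead of requests so that it runs without extra packages, and the User-Agent string is only an example value, not one taken from this post. Whether a header alone is enough depends on the site; this post drives a real browser with Selenium instead.

```python
import urllib.request

url = 'https://www.wunderground.com/hourly/us/ny/new-york-city/KNYNEWYO1335'

# Example Chrome-style User-Agent string (an assumption; copy the one your own browser sends)
CHROME_USER_AGENT = (
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
    'AppleWebKit/537.36 (KHTML, like Gecko) '
    'Chrome/86.0.4240.198 Safari/537.36'
)

# Build the request without sending it, so the header can be inspected first
request = urllib.request.Request(url, headers={'User-Agent': CHROME_USER_AGENT})
print(request.get_header('User-agent'))

# To actually fetch the page (needs network access):
# with urllib.request.urlopen(request, timeout=10) as response:
#     html = response.read().decode('utf-8', errors='replace')
```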


My use case

Import libraries

Declare variables
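A minimal sketch of the variables the loop further down relies on. The names start_date, end_date and lookup_URL come from that loop; the dates and the '/date/{}-{}-{}' suffix on the URL template are assumptions made for illustration only.

```python
from datetime import date, timedelta

# First and last day of the range to scrape (example values, not from the post)
start_date = date(2020, 11, 1)
end_date = date(2020, 11, 8)

# Hypothetical URL template: the three placeholders are filled with year, month, day
lookup_URL = ('https://www.wunderground.com/hourly/us/ny/new-york-city/'
              'KNYNEWYO1335/date/{}-{}-{}')

# Quick check of what the template produces for the first day
print(lookup_URL.format(start_date.year, start_date.month, start_date.day))
```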

Add argument ‘headless’ to Chrome options

options = webdriver.ChromeOptions()
options.add_argument('headless')

Create an instance of ChromeDriver

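# executable_path is the Selenium 3 way to point at the driver; Selenium 4 uses a Service object instead
driver = webdriver.Chrome(executable_path='./chromedriver.exe', options=options)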

‘While’ loop

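while start_date != end_date:
    print('gathering data from: ', start_date)
    # The argument list is cut off in the original; month and day are assumed here
    formatted_lookup_URL = lookup_URL.format(start_date.year,
                                             start_date.month,
                                             start_date.day)
    # SCRAPE AND STORE DATA
    start_date += timedelta(days=1)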
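One thing to watch with a `!=` condition: if start_date is ever later than end_date, the loop never stops. A small sketch of an alternative, using a hypothetical helper called daterange that compares with `<` and yields one day at a time:

```python
from datetime import date, timedelta


def daterange(start, end):
    """Yield each date from start (inclusive) up to end (exclusive)."""
    current = start
    while current < end:  # '<' also terminates when start is after end
        yield current
        current += timedelta(days=1)


days = list(daterange(date(2020, 11, 16), date(2020, 11, 19)))
print(days)
```

With this helper the scraping loop could be written as `for day in daterange(start_date, end_date):`, which leaves start_date itself unchanged.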

Open the URL in the background
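A rough sketch of this step, assuming Selenium 4 (where the driver path goes through a Service object) and a ChromeDriver binary at './chromedriver.exe'. The function name fetch_rendered_html is made up for this example. Pages that load their data late may also need an explicit wait before the HTML is read, which is left out here.

```python
def fetch_rendered_html(url, driver_path='./chromedriver.exe'):
    """Load url in headless Chrome and return the rendered HTML."""
    # Imported here so this file can be loaded even where Selenium is not installed
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    options = webdriver.ChromeOptions()
    options.add_argument('--headless')  # run Chrome without a visible window

    driver = webdriver.Chrome(service=Service(executable_path=driver_path),
                              options=options)
    try:
        driver.get(url)            # navigate and let the page's scripts run
        return driver.page_source  # HTML as the browser currently sees it
    finally:
        driver.quit()              # always release the browser process
```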


Scrape data


Iterate through the elements inside ‘rows’
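As an illustration of walking through table rows, here is a sketch that uses only the standard library's html.parser on a made-up table fragment. It is not the code from this post: the class name TableRowParser and the sample HTML are invented for the example, and in practice the HTML would be the rendered page (for instance Selenium's driver.page_source).

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self._row = []
        elif tag in ('td', 'th') and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ('td', 'th') and self._cell is not None:
            self._row.append(''.join(self._cell).strip())
            self._cell = None
        elif tag == 'tr' and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up fragment standing in for the rendered page
sample_html = """
<table>
  <tr><th>Time</th><th>Temperature</th></tr>
  <tr><td>12:00 AM</td><td>41 F</td></tr>
  <tr><td>1:00 AM</td><td>40 F</td></tr>
</table>
"""

parser = TableRowParser()
parser.feed(sample_html)
for row in parser.rows:
    print(row)
```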

Figure: Browser inspector

The Startup
