Python

Web Scraping: A Quick Case Study

Scraping a ‘table’ from a webpage

Aren Carpenter
Published in The Startup
5 min read · Jan 22, 2021

A recent passion project exploring Broadway data required figures that I was having trouble tracking down. But then, success! I found a webpage with a table of earnings and performance information for each Broadway show! The problem? How to get that data into a usable format.

The answer? Time to scrape! While web scraping is an amazing tool in the data scientist’s toolbox, it wasn’t something I had much exposure to, especially ‘in the wild’.

I hope this guide will be useful for those looking to get into web scraping or to build some more experience.

Step 1: Requests

We’ll start by importing some packages we will use throughout the project.

from bs4 import BeautifulSoup
import requests
import pandas as pd

We’ll use the requests package to send a GET request to the URL defined below. The call to requests.get() returns a requests.Response object containing the server’s reply.

URL = "https://www.broadwayworld.com/grossescumulative.cfm"
r = requests.get(URL) # send the GET request to the URL
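
Before parsing, it can help to confirm that the request actually succeeded. The sketch below is one possible way to do that, not necessarily what this project does: the helper name fetch_html and the 10-second timeout are assumptions made for illustration, while the timeout argument and raise_for_status() are standard parts of requests.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the page body as text, raising on HTTP error statuses.

    fetch_html is a hypothetical helper; the 10-second timeout is an
    arbitrary value chosen for illustration.
    """
    # Stop waiting if the server does not respond within `timeout` seconds.
    r = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError for 4xx/5xx responses instead of parsing an error page.
    r.raise_for_status()
    # The response body decoded to a str.
    return r.text
```

Calling fetch_html(URL) would give the same text as r.text above, but fails loudly if the server returns an error status.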

Now we have the server’s response in this “r” variable, and we can use the BeautifulSoup package to parse the content of…
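
As a rough illustration of the parsing step, the sketch below runs BeautifulSoup over a small, made-up HTML string rather than the real page, whose structure is not shown here. The sample markup, the class name "row", and the helper name extract_rows are all assumptions; "html.parser" is Python's built-in parser, which BeautifulSoup accepts by name.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for a server response (r.text); not the real page.
SAMPLE_HTML = """
<html><body>
  <div class="row"><span>Show A</span><span>100</span></div>
  <div class="row"><span>Show B</span><span>250</span></div>
</body></html>
"""


def extract_rows(html):
    """Return one list of cell strings per <div class="row"> in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for div in soup.find_all("div", class_="row"):
        # Collect the stripped text of every <span> inside this row.
        cells = [span.get_text(strip=True) for span in div.find_all("span")]
        rows.append(cells)
    return rows


rows = extract_rows(SAMPLE_HTML)
```

A list of rows like this can be handed to pd.DataFrame(rows) to get the data into a tabular form; the real page will need its own tag and class names.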
