Web Scraping in R

Sree · The Startup · Sep 20, 2020 · 5 min read

Photo by Ilya Pavlov on Unsplash

Objective: To programmatically fetch the latest product prices hosted on a competitor’s website.

For the purpose of demonstration, let’s look at the websites of WeWork and Regus, two leading players in the co-working industry that compete with each other to offer hot desks, dedicated desks, and private offices across the globe. Let’s try to scrape their California listings to retrieve the latest product prices programmatically.

There were four milestones to accomplish the objective:

  1. Web scraped Regus sites using httr/rvest packages.
  2. Cleaned the dataset and incorporated geospatial coordinates.
  3. Repeated steps 1 & 2 for WeWork websites.
  4. Embedded R script in Power BI and visualized the final output.
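At their core, steps 1–3 share the same scrape-and-extract pattern: fetch a page, select the elements that hold names and prices, and pull out their text. A minimal rvest sketch of that pattern is below; it uses an inline HTML snippet and made-up CSS selectors (`.name`, `.price`), since the real Regus and WeWork page structures are not shown here.

```r
library(rvest)

# Stand-in for a fetched page; a real scrape would use read_html(url).
# The markup and selectors below are illustrative, not Regus's actual HTML.
page <- minimal_html('
  <div class="location">
    <span class="name">Downtown LA</span>
    <span class="price">$299/mo</span>
  </div>')

name  <- page %>% html_element(".name")  %>% html_text()
price <- page %>% html_element(".price") %>% html_text()
```

For a live site, `read_html("https://...")` replaces `minimal_html()`, and `html_elements()` (plural) would return every matching listing on the page.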

Phase 1: Web scraped Regus sites using httr/rvest packages

  • Step 1.1. Imported Libraries: Imported all the relevant libraries upfront.
library(tidyverse)
library(rvest)
library(revgeo)
library(opencage)
library(dplyr)
library(sqldf)
library(formattable)
library(stringr)
library(ngram)
library(httr)
library(rlist)
library(jsonlite)
library(lubridate)
library(splitstackshape)
  • Step 1.2. Regus Location API: Extracted the co-working locations in California from Regus…
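The general shape of this step, hitting a location endpoint with httr and parsing the JSON response with jsonlite, can be sketched as follows. The endpoint, query parameters, and payload fields below are hypothetical placeholders, not Regus’s actual API.

```r
library(httr)
library(jsonlite)

# A live call would look roughly like this (endpoint is a placeholder):
# resp      <- GET("https://www.example.com/api/centres",
#                  query = list(state = "CA"))
# locations <- fromJSON(content(resp, as = "text", encoding = "UTF-8"))

# Offline illustration of the same parse step on a sample payload:
payload   <- '[{"name":"Sacramento","lat":38.58,"lng":-121.49}]'
locations <- fromJSON(payload)   # data frame with one row per centre
```

`fromJSON()` flattens a JSON array of objects into a data frame, which slots directly into the cleaning and geocoding work of step 2.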
