Web Scraping Key Economic Indicators

Using JSON & Pandas To Gather Information

Aug 23, 2020

There is a plethora of financial data available nowadays, and seemingly even more places to source it from. There are countless ways to go about gathering data, many of which require third-party API packages that must be installed on your system before you can make the necessary API calls. In this quick notebook walkthrough, we will demonstrate how to perform a simple JSON web scrape to fetch the data and then organize it into a pandas DataFrame. We will then use the Python library Plotly to visualize the indicators.

Importing Libraries

import pandas as pd
import requests 
import json
import plotly.graph_objects as go

Next, we must write our web scrape function. We will be using the third-party API from DB-nomics (db.nomics.world). The API returns our data as JSON text over HTTP, which means we first need to convert the response into a Python object. We use the requests library for this: requests.get fetches the URL, and the response's .json() method decodes the body. At this point, the data is organized as a nested dictionary, which means we need to index into it to grab the actual data. We pull out three variables: one for the time-series index (periods), one for our actual data values (values), and one for the name of the over-arching dataset (dataset), which becomes the column label of our final DataFrame. We then return the DataFrame “indicators”.
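To make the indexing step concrete, the sketch below decodes a small hand-written payload that mimics the nesting described above. The sample values and the dataset name are made up for illustration, and a real response carries many more fields.

```python
import json

# Hand-written sample mimicking the nesting of a DB-nomics response.
# The numbers and the dataset name are illustrative, not real data.
sample_text = """
{
  "series": {
    "docs": [
      {
        "dataset_name": "Example dataset",
        "period": ["2020-01", "2020-02", "2020-03"],
        "value": [1.0, 1.5, 1.2]
      }
    ]
  }
}
"""

payload = json.loads(sample_text)        # JSON text -> nested Python dict
first_series = payload["series"]["docs"][0]

periods = first_series["period"]         # time-series index
values = first_series["value"]           # observations
dataset = first_series["dataset_name"]   # label for the DataFrame column

print(dataset, list(zip(periods, values)))
```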

Define Web Scrape Function

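def scrapeindicator(url):
    # Fetch the series and decode the JSON body into a Python dictionary
    r = requests.get(url)
    r_json = r.json()

    # The first entry under series -> docs holds the series we asked for
    periods = r_json['series']['docs'][0]['period']
    values = r_json['series']['docs'][0]['value']
    dataset = r_json['series']['docs'][0]['dataset_name']

    # Build the DataFrame, indexed by period and labelled with the dataset name
    indicators = pd.DataFrame(values, index = periods)
    indicators.columns = [dataset]
    return indicators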
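A slightly more defensive variant might separate parsing from fetching, so the parsing step can be tried without a network connection. The sketch below is one way to do that, not the original function: parse_series is a hypothetical helper name, the sample payload is invented, and the idea that missing observations may arrive as strings such as "NA" is an assumption worth checking against real responses.

```python
import pandas as pd

def parse_series(payload):
    """Build a one-column DataFrame from an already-decoded API response."""
    doc = payload["series"]["docs"][0]
    frame = pd.DataFrame({doc["dataset_name"]: doc["value"]},
                         index=pd.to_datetime(doc["period"]))
    # Assumption: missing observations may arrive as strings such as "NA";
    # coerce anything non-numeric to NaN so the column stays numeric.
    return frame.apply(pd.to_numeric, errors="coerce")

# Invented payload with one missing observation, for illustration only.
sample = {"series": {"docs": [{
    "dataset_name": "Example dataset",
    "period": ["2020-01", "2020-02", "2020-03"],
    "value": [1.0, "NA", 1.2],
}]}}

indicators = parse_series(sample)
print(indicators)
```

With a live connection, the same helper could be fed requests.get(url, timeout=30).json(). Converting the index to datetimes also lets monthly and annual series share one time axis when plotted.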

Utilize Indicator Scrape Function

Now that we have the function defined, we can use it to begin scraping our data. Through the DB-nomics API we can search for any global indicator we’d like. Below, we scrape six different indicators for Europe: European 10-year yields, unemployment rates, interest rates, inflation rates, annual GDP growth rates, and monthly changes in retail trade.

euro_yields_10y = scrapeindicator('https://api.db.nomics.world/v22/series/Eurostat/irt_euryld_m/M.EA.INS_FWD.CGB_EA.Y10?observations=1')

unemployment = scrapeindicator('https://api.db.nomics.world/v22/series/Eurostat/une_rt_m/M.NSA.TOTAL.PC_ACT.T.EA19?observations=1')

interest = scrapeindicator('https://api.db.nomics.world/v22/series/Eurostat/ei_mfir_m/M.NSA.NAP.MF-LTGBY-RT.EU28?observations=1')

inflation = scrapeindicator('https://api.db.nomics.world/v22/series/WB/WDI/FP.CPI.TOTL.ZG-EU?observations=1')

GDPgrowth = scrapeindicator('https://api.db.nomics.world/v22/series/WB/WDI/NY.GDP.MKTP.KD.ZG-EU?observations=1')

monthly_change_retail_trade = scrapeindicator('https://api.db.nomics.world/v22/series/Eurostat/sts_trtu_m/M.TOVT.G47.CA.PCH_SM.EA19?observations=1')
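Each of these URLs follows the same pattern: a provider code, a dataset code, and a series code, followed by the observations=1 query parameter. As a rough sketch, a small helper could assemble them; build_series_url and the dictionary labels are hypothetical names, and the codes are copied from the calls above.

```python
BASE_URL = "https://api.db.nomics.world/v22/series"

def build_series_url(provider, dataset, series):
    """Assemble a series URL from its provider, dataset, and series codes."""
    return f"{BASE_URL}/{provider}/{dataset}/{series}?observations=1"

# Codes copied from the calls above; the dictionary keys are just labels.
series_codes = {
    "unemployment": ("Eurostat", "une_rt_m", "M.NSA.TOTAL.PC_ACT.T.EA19"),
    "inflation": ("WB", "WDI", "FP.CPI.TOTL.ZG-EU"),
}

urls = {name: build_series_url(*codes) for name, codes in series_codes.items()}
print(urls["unemployment"])
```

Each resulting URL can then be passed to scrapeindicator as before.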

Our resulting Monthly Retail Growth % DataFrame:

We now have all of our data scraped, so it’s time to use Plotly for visualizations:

# Instantiate a Plotly graph
fig = go.Figure()

# Add Interest Rates (EU) Trace
fig.add_trace(go.Scatter(x = interest.index,
                         y = interest['Interest rates - monthly data'],
                         name = 'Interest',
                         line_color = 'deepskyblue',
                         opacity = 0.8))

# Add Unemployment Rates Index
fig.add_trace(go.Scatter(x = unemployment.index,
                         y = unemployment['Unemployment by sex and age – monthly data'],
                         name = 'Unemployment',
                         line_color = 'red',
                         opacity = 0.8))

# Add European Yields (10Y)
fig.add_trace(go.Scatter(x = euro_yields_10y.index,
                         y = euro_yields_10y['Euro yield curves - monthly data'],
                         name = 'Euro Yields - 10Y',
                         line_color = 'green',
                         opacity = 0.8))

# Add Inflation
fig.add_trace(go.Scatter(x = inflation.index,
                         y = inflation['World Development Indicators'],
                         name = 'Inflation',
                         line_color = 'purple',
                         opacity = 0.8))

# Add GDP Growth
fig.add_trace(go.Scatter(x = GDPgrowth.index,
                         y = GDPgrowth['World Development Indicators'],
                         name = 'GDP Growth',
                         line_color = 'pink',
                         opacity = 0.8))

# Add Monthly Retail Change in Volume
fig.add_trace(go.Scatter(x = monthly_change_retail_trade.index,
                         y = monthly_change_retail_trade['Turnover and volume of sales in wholesale and retail trade - monthly data'],
                         name = '% Monthly Change Volume Sales',
                         line_color = 'black',
                         opacity = 0.8))

# Edit Attributes of plot
fig.update_layout(xaxis_range = ['2003-07-01', '2020-12-31'],
                  title_text = "Interest Rates, Unemployment, 10y yields, inflation EU, volume sales",
                  xaxis_rangeslider_visible = True)

fig.show()

We are left with an interactive Plotly visualization: the range slider lets us zoom in on any period, and clicking a legend entry toggles that indicator on or off. Stay tuned for the next post, where we will automate the function and tie it into the DB-nomics third-party API on a live connection.

Written by Andrew Cole