Pandas Datareader & Federal Reserve Economic Data (FRED)

Steve Younessi
Published in The Startup · 5 min read · Jul 6, 2020
The St. Louis Federal Reserve FRED Database of Economic Data

Federal Reserve Economic Data (FRED) is an incredible resource for economic data maintained by the Federal Reserve Bank of St. Louis. There are lots of time series categories to choose from like gross domestic product, interest rates, unemployment, and many others. It is easy to navigate and all of the data can be downloaded in several different formats.

FRED allows users to download selected data in a variety of formats

However, compiling lots of different economic indicators on a regular basis can be made even faster by using pandas datareader to pull data from FRED without having to visit the website to download files.

"I think just about everyone doing short-order research, trying to make sense of economic issues in more or less real time, has become a FRED fanatic." — Paul Krugman

First, make sure that Jupyter Notebook is installed. I created a prior tutorial that explains this process in slightly greater detail. Second, ensure that the following packages are installed:

  • pandas
  • numpy
  • datetime (part of the Python standard library; no install needed)
  • requests
  • lxml
  • matplotlib or plotly (if interested in plotting FRED data visually)

All of these packages, in addition to pandas-datareader itself, can be installed by launching Command Prompt in Windows and using pip.

pip install pandas-datareader

Launch Jupyter Notebook and create a new notebook and import the necessary packages.

import pandas_datareader as pdr
import pandas as pd
import datetime

Any series on FRED can be accessed using a unique identifier. Here I am using PAYEMS, the code for total nonfarm payrolls in the United States. It helps to become familiar with these identifiers in FRED and to create an index of the ones you find important. I keep a Word document with all of the codes I use often, each with a brief description.
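A plain dictionary works just as well as a separate document for keeping a personal index of codes. The entries below are a hypothetical starter list; the short descriptions are my own shorthand, not FRED's official titles:

```python
# A small personal index of FRED series codes (descriptions are informal shorthand).
FRED_CODES = {
    "PAYEMS": "Total nonfarm payrolls, monthly, thousands of persons",
    "UNRATE": "Civilian unemployment rate, monthly, percent",
    "GDP": "Gross domestic product, quarterly, billions of dollars",
}

# Print the index as a quick reference.
for code, description in FRED_CODES.items():
    print(f"{code}: {description}")
```

Keeping the codes in a dict also means they can be passed straight to DataReader later, since it accepts a list of identifiers.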

Use datetime to create a start and end point then use datareader to pull the data from FRED and create a dataframe. In this case, I want PAYEMS data from May 2005 to June 2020. Since this is monthly data, the day is not important and would return the same results whether I chose the first day of the month or the last.

start = datetime.datetime(2005, 5, 1)
end = datetime.datetime(2020, 6, 1)

df = pdr.DataReader('PAYEMS', 'fred', start, end)

To confirm, use display(df) to view the dataframe.

One particular quirk of FRED payroll data is that it's reported in thousands. So 137,802 actually means ~137.8 million. To work with the real headcount, just multiply the dataframe by a thousand using df = df*1000 and the values become actual numbers of employees.
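The scaling is easy to sanity-check on a toy frame; the value below mirrors the ~137,802 figure mentioned above:

```python
import pandas as pd

# PAYEMS is reported in thousands of persons.
toy = pd.DataFrame({"PAYEMS": [137802]})

# Multiply by 1,000 to get the actual headcount.
toy = toy * 1000

print(toy["PAYEMS"].iloc[0])  # 137802000, i.e. ~137.8 million
```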

Notice that something extraordinary seems to have occurred beginning in March and April of 2020 — a large drop-off. It is clear that the number has gone from ~152 million to ~130 million but I want to plot this change to see the relative severity and how it compares to past data points. Because the theme of this tutorial is speed, I use plotly.express to create quick plots with minimal code.

import plotly.express as px

With plotly.express I can now create a simple line plot and use fig.show() to see it.

fig = px.line(df, y='PAYEMS',
              title='Total Nonfarm Payrolls 2005-2020',
              labels={'PAYEMS':'All Nonfarm Employees'})
fig.show()
Plotly express is dynamic and interactive

I am also interested to see how these changes occurred on a month-to-month basis. I can observe that payrolls dropped substantially in April 2020 but I want the exact amount.

To convert the PAYEMS data into monthly change in payrolls, all it takes is the .diff() function.

df = df.diff()

The entire series now shows the monthly changes rather than absolute values.
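On a toy series, .diff() replaces each value with its change from the previous row; the first row becomes NaN because there is nothing before it to subtract. The numbers here are illustrative, not real FRED data:

```python
import pandas as pd

# Illustrative payroll levels (in thousands), as PAYEMS reports them.
payrolls = pd.Series([151000, 150000, 130000], name="PAYEMS")

# Each value becomes the change from the previous month.
changes = payrolls.diff()

print(changes.tolist())  # [nan, -1000.0, -20000.0]
```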

From here, I want to make a quick bar chart to see the enormity of what occurred.

fig = px.bar(df, y='PAYEMS',
             title='Monthly Change in Payrolls, 2005-2020',
             labels={'PAYEMS':'Monthly Payroll Change'})
fig.show()

I also want to save my dataframe to a .csv file so that I can upload it to Flourish and create visualizations that can be embedded on Medium.

df.to_csv('payrolls.csv')

This will save the identified dataframe as a .csv file so that it can be used elsewhere (like Tableau, Excel, Stata, etc).
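One detail worth knowing for the round trip: to_csv writes the date index as an ordinary column, so when reading the file back into pandas, parse_dates and index_col restore it. A self-contained sketch using a small illustrative frame:

```python
import pandas as pd

# Round-trip sketch: save a dated frame, then restore the DATE index on read.
df = pd.DataFrame(
    {"PAYEMS": [151000, 130000]},
    index=pd.to_datetime(["2020-03-01", "2020-04-01"]),
)
df.index.name = "DATE"  # FRED dataframes use a DATE index

df.to_csv("payrolls.csv")
restored = pd.read_csv("payrolls.csv", parse_dates=["DATE"], index_col="DATE")

print(restored.equals(df))  # True
```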

FRED is a treasure trove of economic information that would be so much more difficult to amass individually. It is a resource no economist, statistician, or general enthusiast of economic data should go without. Using pandas datareader to compile data from FRED makes the process of discovery faster and easier so that more time can be spent on amazing projects.

BONUS: Pulling multiple series into one dataframe

I have been closely watching the split between temporary and permanent layoffs in nonfarm payroll data recently to properly gauge what effect COVID-19 is having on the labor market. This data is available on FRED as two separate series: LNS13023653 (temporary layoffs) and LNS13026638 (permanent layoffs).

To gather two sets of data into one dataframe is a simple process.

start = datetime.datetime(2005, 5, 1)
end = datetime.datetime(2020, 6, 1)

df = pdr.DataReader(['LNS13023653', 'LNS13026638'],
                    'fred', start, end)

df = df*1000

display(df)

The snippet above identifies the date range and the data to be gathered, converts the employment figures from thousands to actual counts, and displays the dataframe.
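Once both series sit in one frame, derived columns are one-liners. As an example, here is a sketch of computing permanent layoffs as a share of the two combined; the column names match the FRED codes, but the values are purely illustrative:

```python
import pandas as pd

# Illustrative values only, not real FRED data.
layoffs = pd.DataFrame({
    "LNS13023653": [10000, 18000],  # temporary layoffs
    "LNS13026638": [2000, 2000],    # permanent layoffs
})

# Permanent layoffs as a fraction of all layoffs in each row.
total = layoffs["LNS13023653"] + layoffs["LNS13026638"]
layoffs["Permanent share"] = layoffs["LNS13026638"] / total

print(layoffs["Permanent share"].tolist())  # [~0.167, 0.1]
```

A rising permanent share over time is one simple way to quantify the temporary-versus-permanent split discussed above.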

Even the biggest FRED nerds alive won’t know what LNS13023653 & LNS13026638 are supposed to mean.

There is only one outstanding issue left — the column names. They have to be renamed to denote what they actually represent.

df = df.rename(columns={'LNS13023653':'Temporarily Laid Off',
                        'LNS13026638':'Permanently Laid Off'})
Now it makes sense to anyone viewing it

I can save to a .csv file if I want to work on this data elsewhere, I can continue analyzing it for different derivations, or I can use plotly.express to create a quick plot of what I have.

fig = px.line(df, y=['Temporarily Laid Off',
                     'Permanently Laid Off'],
              title='Permanent vs Temporary Layoffs',
              labels={'value':'Total',
                      'variable':'Status'})

fig.update_layout(hovermode='x')

fig.show()

To reiterate, speed is the key. Time is a finite resource so saving time compiling data allows for more time to be spent on analysis and creating impressive visualizations.
