Overview of the TIFIA Program

Analyzing a program without easily downloadable data

Nikhil Bhandari
Nov 21, 2020
Photo by Ricardo Gomez Angel on Unsplash

I. Introduction

II. Review of the TIFIA Program

2.1 Projects by Sector

Figure 1: Number of Projects by Sector
Figure 2: Assistance by Sector
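
As a rough sketch of how the tallies behind Figures 1 and 2 can be reproduced from the scraped data (the web-scraper code appears at the end of this post), the snippet below assumes hypothetical normalized column names project_sector and tifia_assistance; the actual names depend on the field labels on the DOT project pages.

# Sketch: project counts and total assistance by sector (Figures 1 and 2).
# 'project_sector' and 'tifia_assistance' are assumed column names;
# check df.columns after running the scraper at the end of this post.
import pandas as pd

df = pd.read_csv('tifia_data_nov2020.csv')

# Figure 1: number of projects in each sector
print(df['project_sector'].value_counts())

# Figure 2: total TIFIA assistance by sector; the scraped dollar
# figures are text, so strip '$' and ',' before converting
assist = pd.to_numeric(
    df['tifia_assistance'].str.replace(r'[$,]', '', regex=True),
    errors='coerce')
print(assist.groupby(df['project_sector']).sum())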

2.2 Projects by Status

Figure 3: Number of Projects by Status

2.3 Projects by Instrument Type

Figure 4: Number of Projects by Instrument Type. Note: The NA category reflects projects that are in the loan review process, retired, or have missing data.
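
As a hedged illustration in pandas, missing instrument types can be folded into an explicit NA bucket before counting; the column name instrument_type is an assumption based on the scraper's normalized names.

# Sketch: fold missing instrument types into an 'NA' label before counting.
# 'instrument_type' is a hypothetical column name.
import pandas as pd

df = pd.read_csv('tifia_data_nov2020.csv')
print(df['instrument_type'].fillna('NA').value_counts())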

2.4 Projects by Year and Sector

Figure 5: Number of Projects by Year and Sector
Figure 6: TIFIA Assistance by Year and Sector
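
A minimal sketch of the year-by-sector tabulations behind Figures 5 and 6, again with assumed column names (fiscal_year, project_sector, tifia_assistance) rather than the exact code used for the figures:

# Sketch: cross-tabulations behind Figures 5 and 6.
# All column names are assumptions; verify against the scraped CSV.
import pandas as pd

df = pd.read_csv('tifia_data_nov2020.csv')

# Figure 5: project counts by year and sector
print(pd.crosstab(df['fiscal_year'], df['project_sector']))

# Figure 6: total assistance by year and sector, in $M
df['assist_musd'] = pd.to_numeric(
    df['tifia_assistance'].str.replace(r'[$,]', '', regex=True),
    errors='coerce') / 1e6
print(df.pivot_table(index='fiscal_year', columns='project_sector',
                     values='assist_musd', aggfunc='sum'))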

2.5 TIFIA Assistance at Project Level

Figure 7: TIFIA Assistance by Project and Sector. Note that this is a standard boxplot: the box shows the 25th percentile, the median, and the 75th percentile. The whiskers extend to Q1 - 1.5 x IQR and Q3 + 1.5 x IQR, where the interquartile range (IQR) is the 75th percentile minus the 25th. The dots show individual projects; they are jittered for clarity.
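
The boxplot-with-jitter construction is straightforward to sketch in matplotlib; the snippet below is a minimal version under the same assumed column names, not the exact code behind the figure.

# Sketch of a jittered boxplot in the style of Figure 7.
# Column names are assumptions; this is not the exact plotting code.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('tifia_data_nov2020.csv')
df['assist_musd'] = pd.to_numeric(
    df['tifia_assistance'].str.replace(r'[$,]', '', regex=True),
    errors='coerce') / 1e6

sectors = sorted(df['project_sector'].dropna().unique())
data = [df.loc[df['project_sector'] == s, 'assist_musd'].dropna()
        for s in sectors]

fig, ax = plt.subplots()
# matplotlib's default whiskers are Q1 - 1.5 x IQR and Q3 + 1.5 x IQR
ax.boxplot(data, labels=sectors, showfliers=False)
# overlay the individual projects, jittered horizontally for clarity
for i, vals in enumerate(data, start=1):
    ax.scatter(np.random.normal(i, 0.06, size=len(vals)), vals,
               alpha=0.5, s=12)
ax.set_ylabel('TIFIA assistance ($M)')
plt.show()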

III. Closure

References

Web-scraper Code

# PYTHON 3.8
# program to scrape TIFIA project info
from bs4 import BeautifulSoup
import urllib.request
import pandas as pd

raw_link = "https://www.transportation.gov/buildamerica/projects/financing-search?page="
proj_list = list()

# first go through the summary pages to get the links to the
# individual project pages, looping through all the pages
for page_num in range(0, 100):
    page_link = raw_link + str(page_num)
    print(page_link)

    # get the page contents
    tifia_page = urllib.request.urlopen(page_link).read()
    html = BeautifulSoup(tifia_page, "html.parser")

    # scrape the page; the relevant data is in ARTICLE tags
    proj = html.find_all('article',
                         class_='node__content view--item clearfix project__teaser')
    print('Number of projects: ', len(proj))
    for a_proj in proj:
        link = a_proj.find('a')['href']
        proj_list.append(link)

    # a full summary page lists 10 projects; fewer means this is the last page
    if len(proj) < 10:
        print('reached last page: ', page_num)
        break

# now loop through the individual project pages
all_proj_dict = dict()
for a_proj in proj_list:
    proj_link = 'https://www.transportation.gov' + a_proj
    print(proj_link)

    # get the page contents
    proj_page = urllib.request.urlopen(proj_link).read()
    proj_html = BeautifulSoup(proj_page, "html.parser")

    # scrape the page; the relevant data is in ARTICLE tags and
    # each important data point is a field label/item pair
    arts = proj_html.find_all('article')
    print('number of articles found', len(arts))
    my_dict = dict()

    for a_a in arts:
        # get the title
        a_title = a_a.find_all('h1', class_="node__title")
        my_dict['Project Title'] = a_title[0].text.strip('\n')

        all_fields = a_a.find_all(class_='field')
        for a_field in all_fields:
            a_f_l = a_field.find_all('div', class_='field__label')
            a_f_i = a_field.find_all('div', class_='field__item')
            if len(a_f_l) > 0:
                k1 = a_f_l[0].text
                v1 = a_f_i[0].text
                my_dict[str(k1)] = str(v1)

    # store the data in all_proj_dict
    all_proj_dict[a_proj] = my_dict

# convert the collected dictionaries to a dataframe
v_l = list()
for k, v in all_proj_dict.items():
    v_l.append(v)

df = pd.DataFrame(v_l)

# OPTIONAL - normalize the column names;
# makes life easier in subsequent steps
for a_col in df.columns:
    n_col = a_col.lower().replace(' ', '_').replace('/', '_')
    df.rename(columns={a_col: n_col}, inplace=True)

# save the dataframe to a CSV file
df.to_csv('tifia_data_nov2020.csv', index=False)
print('All done!')
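
Once the CSV is written, a quick sanity check is to load it back and inspect the row count and the normalized column names before any analysis:

# Sanity check on the scraped file.
import pandas as pd

df = pd.read_csv('tifia_data_nov2020.csv')
print(df.shape)
print(list(df.columns))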
