A Data Analysis Of Consumer Financial Protection Bureau Complaints From 2016
In November, Mick Mulvaney was appointed Acting Director of the Consumer Financial Protection Bureau. While Mulvaney has stated he will use his time as bureau director to protect Americans from “burdensome regulations,” many supporters of the CFPB see his mission there as neutering and dismantling the agency. But what if Mick Mulvaney were to use the agency’s considerable powers to address the financial concerns of the forgotten men and women whom Trump purports to represent?
To investigate this question, I used the CFPB’s consumer complaint database, which the CFPB makes public. The database is an incredibly rich data resource that includes information about the company, the product and the nature of the problem. It also includes the filer’s zip code, which is enough information to tell us about the financial issues that affect a state or county.
Knowing someone’s place of residence doesn’t predict who that person votes for. But as a recent analysis by 538 argues, counties have become increasingly polarized towards blue or red, with little in between (in fact, fewer than 10% of counties were won by a single-digit margin in 2016). So it makes sense to use county-level data as a rough proxy for the financial issues affecting the parts of the US whose votes carried Trump to victory in the election.
I decided to use location and election data to take all the complaints from 2016 (about 105,000) and split them into counties that went for Clinton or for Trump. To do this, I joined the complaints to a ZIP-county relationship file from the Census and to county-level election data (with a little cleaning help from SQL) and analyzed it all in Tableau.
To start, below is a map of every county for which there was CFPB complaint data in 2016. (This is a good reminder that land area does not equal population.)
Here are some of the insights I found in the data:
Trump-voting counties on the Eastern seaboard and the Gulf Coast appeared to be the most active users of CFPB complaints.
As you can see, the Eastern seaboard from New York down to Florida had the most active counties per capita. Areas along the Gulf Coast out to Texas had some very active counties too. (Squaring the number of complaints was a quick and easy way to get a broader spread, as I explain further below.)
Similarities and differences between counties: debt collection, mortgages and bank accounts.
There were twice as many complaints from Clinton counties as from Trump counties, which suggests that the Bureau could stand to do more outreach in heavily Republican areas. It is worth noting, though, that complainants from either place appeared equally satisfied with the results they obtained, judging from the similarly low percentages of consumers who disputed the resolution of their complaints.
Trump and Clinton counties were very similar in what they complained to the CFPB about. However, as this treemap chart shows, consumers in Trump counties complained more about debt collection, and less about mortgages and bank accounts, than consumers in Clinton counties.
Trump counties seem to have more problems with medical and credit card debt.
Debt collection is one of the industries that showed the most notable differences between Trump counties and Clinton counties in the types of products complained about. The bar chart above looks at the complaints about debt collection, separated by the type of debt. Consumers in Trump counties complained about credit card and medical debt at noticeably higher rates, while consumers in Clinton counties complained more about “other” products such as phones and health clubs.
Conclusion: Potential explanations
The data that I’ve chosen to highlight seems to confirm one of my working hypotheses when I first set out on this project: the differences in what consumers in Trump and Clinton counties complained about (and their underlying causes, in the case of debt collection) seem like typical urban-rural differences. More problems with debt collection, or with defaulting on medical bills, could have to do with divides in wealth, access to healthcare, or other similar factors. If the CFPB wants to help Trump’s base, it might be well served to look at the financial implications of these issues.
This explanation doesn’t tell the whole story, though, and it would be well worth comparing the complaint data to other data on public health or urban/rural classifications.
Stay tuned for more.
About Transformations to the Data
I used three main datasets to put this viz together.
- A CSV of all the CFPB complaint data from 2016
- US Census Zip Code Tabulation Area to County Relationship File
- County-Level election results, scraped from townhall.com and posted to Github
The ZCTA-County relationship file was a valuable tool for converting zip codes to counties, as it contains the county (in FIPS format) as well as population and other useful info. To use it, I had to deal with the issue of duplicate zip codes in the table. Zip codes can fall across multiple counties, and when this happens, the ZCTA table has one entry for that zip code for each county in which it falls. To avoid double counting complaints in these zip codes, I used a SQL query to keep the entry for the county where the majority of that zip code’s inhabitants live and strip out the others. (This is a rough way to do it, but only a small minority of zip codes fell across multiple counties, so any complaints put into the wrong county would make up a very small percentage.) The resulting CSV, called zipconverter, can be found on my project page on data.world.
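The deduplication step can be sketched in a few lines of Python. This is only an illustration, not the SQL query used for the project: the sample rows, the FIPS codes, and the function name `build_zip_converter` are made up, and the sketch keeps the county with the largest share of a zip code's population, which is assumed to match the "majority" rule described above.

```python
# Hypothetical rows shaped like a ZCTA-to-county relationship file:
# (zip code, county FIPS, population of that zip code living in that county).
rows = [
    ("30001", "13001", 1200),
    ("30002", "13001", 300),   # zip 30002 straddles two counties
    ("30002", "13003", 900),
    ("30003", "13005", 50),
]


def build_zip_converter(rows):
    """Keep one county per zip code: the one holding most of its population."""
    best = {}  # zip code -> (county FIPS, population seen for that county)
    for zip_code, county_fips, population in rows:
        current = best.get(zip_code)
        if current is None or population > current[1]:
            best[zip_code] = (county_fips, population)
    # Drop the population once the winner for each zip code is known.
    return {zip_code: fips for zip_code, (fips, _) in best.items()}


zipconverter = build_zip_converter(rows)
for zip_code, fips in sorted(zipconverter.items()):
    print(zip_code, fips)
```

Each zip code now maps to exactly one county, so a complaint can never be counted twice after the join.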
I then joined the CFPB complaint file, the zipconverter, and the county-level election data within Tableau, using inner joins. I then analyzed and visualized the resulting data within Tableau.
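For readers who want to reproduce the joins outside Tableau, here is a minimal sketch of the same inner-join logic in plain Python. The field names (`zip`, `product`, `winner`) and the sample records are assumptions for illustration and do not reflect the real column names in any of the three files.

```python
from collections import Counter

# Hypothetical sample data standing in for the three datasets.
complaints = [
    {"complaint_id": 1, "product": "Debt collection", "zip": "30001"},
    {"complaint_id": 2, "product": "Mortgage", "zip": "30002"},
    {"complaint_id": 3, "product": "Debt collection", "zip": "99999"},  # zip not in the converter
]
zip_to_county = {"30001": "13001", "30002": "13003"}   # zip code -> county FIPS
county_winner = {"13001": "Trump", "13003": "Clinton"}  # county FIPS -> 2016 winner


def inner_join(complaints, zip_to_county, county_winner):
    """Attach county and winner to each complaint; drop rows with no match."""
    joined = []
    for complaint in complaints:
        fips = zip_to_county.get(complaint["zip"])
        winner = county_winner.get(fips)
        if fips is None or winner is None:
            continue  # an inner join keeps only rows matched in every table
        joined.append({**complaint, "county_fips": fips, "winner": winner})
    return joined


joined = inner_join(complaints, zip_to_county, county_winner)
by_winner = Counter(row["winner"] for row in joined)
print(by_winner)
```

Because these are inner joins, any complaint whose zip code has no match in the converter or the election data is dropped silently, so it is worth comparing the row count before and after the join.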
About the heat map: Squaring the number of complaints was a quick and easy way to get a broader spread and get around the problem of counties with extremely low population whose one complaint put them in the top rank. As I add more years of data to this project, I hope to phase it out.
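To illustrate why squaring helps, here is a small sketch with made-up numbers. The exact formula behind the map is not spelled out above, so the sketch assumes one plausible reading: the number of complaints is squared and then divided by the county's population. The function names and county figures are hypothetical.

```python
def per_capita(complaints, population):
    """Plain complaints per resident."""
    return complaints / population


def squared_score(complaints, population):
    """Assumed heat-map score: squared complaint count per resident."""
    return complaints ** 2 / population


# Hypothetical counties: (number of complaints, population).
counties = {
    "tiny county": (1, 500),
    "large county": (400, 1_000_000),
}

# Rank the counties from highest to lowest under each score.
rank_plain = sorted(counties, key=lambda name: per_capita(*counties[name]), reverse=True)
rank_squared = sorted(counties, key=lambda name: squared_score(*counties[name]), reverse=True)

print("per capita:", rank_plain)
print("squared:   ", rank_squared)
```

Under the plain per-capita rate, the tiny county's single complaint puts it first. Squaring leaves a count of one unchanged but amplifies larger counts, so the large county moves ahead.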
The full interactive charts are on my Tableau Public page.
The consumer complaint data and the zip code converter are on my data.world page.
County-level election data is from Townhall.com via Github.