Open-Sourcing Thumbtack’s Economic Sentiment Data

By: Sujin Oh

Thumbtack Engineering
4 min read · Mar 24, 2017

Photo by Markus Spiske on Unsplash

Today, we’re very happy to announce that we’re open sourcing the results from Thumbtack’s monthly Economic Sentiment Survey (ESS) series. The ESS captures the attitudes and perspectives of thousands of business owners across the country to gauge how they feel about the economy and their businesses. Now in its fifth year, this survey provides a unique vantage point on the economy, as respondents are largely mobile service professionals with five or fewer employees who operate in households across the United States. Because they are hard to reach, these professionals are frequently overlooked in other surveys of small businesses.

With Thumbtack’s Economic Sentiment Survey data, we can seek to understand relationships between small business sentiments and various economic indicators including unemployment rates and inflation. In addition, our data can provide color to federal and local policy discussions on topics that affect small business owners such as healthcare and local industry regulations.

In this post, we’ll share how you can access the data via our API and start analyzing it in R. But if you’re a Python user, don’t worry — we’ve posted a separate, Python-specific tutorial on our GitHub repo. And, of course, if you do use our data, we ask that you cite us properly.

Before we start, it’s important to point out that the data can be cut in a variety of ways: by state, month, industry, or demographic group. To get a comprehensive overview of the scope of our data and find out more about our survey methodology, we recommend reviewing our full documentation on GitHub. For now, we’ll focus on how to quickly get up and running in accessing the various data we’re now publishing.

How to access ESS data via R

Step 1. Request the data from ESS API

Our API serves the data in JSON format, so to get it into R you’ll need the httr and jsonlite packages, which provide efficient, generalizable functions for pulling data from web APIs and getting it into the right format for analysis.* The httr package lets you send HTTP requests and receive HTTP responses directly from R, while the jsonlite package converts JSON objects to R objects.**

So, as a starting point, load both libraries: httr and jsonlite. Then, use the GET() function from httr to make a request to our API.
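A minimal sketch of this step might look like the following. The endpoint shown is a placeholder, not the real ESS API address (the actual URLs are listed in the documentation on GitHub), and request_ess is a hypothetical helper name used only for illustration.

```r
library(httr)
library(jsonlite)

# Placeholder endpoint (an assumption for illustration only); substitute the
# real ESS API URL from the documentation on GitHub.
ess_url <- "https://example.com/ess-api/placeholder"

# Hypothetical helper: send an HTTP GET request and return the httr response.
request_ess <- function(url) {
  GET(url)
}

# Uncomment once ess_url points at the real API:
# response <- request_ess(ess_url)
```

Wrapping the call in a small function keeps the request logic in one place if you later pull several cuts of the data.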

Step 2. Check status of the request

Print the variable that holds the response to see its status, or apply the status_code() function to it to get just the numeric code.

If you’re pulling data over multiple iterations, such as sentiment scores for each age group, call either warn_for_status(x) or stop_for_status(x) on each response. When a request has failed, the former issues a warning and lets the loop continue, while the latter raises an error that stops the loop. Including one of them catches errors as soon as they occur.
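As a rough sketch of checking the status inside a loop, assuming a named character vector of endpoints (one per cut of the data), the pattern could look like this. The helper name fetch_all and its urls argument are hypothetical; only GET(), stop_for_status(), and warn_for_status() come from httr.

```r
library(httr)

# Hypothetical helper: request each URL in turn and halt at the first failure.
# `urls` is a named character vector, e.g. one endpoint per age group.
fetch_all <- function(urls) {
  responses <- list()
  for (name in names(urls)) {
    response <- GET(urls[[name]])
    # Raises an R error when the status code is 300 or above, which ends the
    # loop. Swap in warn_for_status(response) to warn and keep going instead.
    stop_for_status(response)
    responses[[name]] <- response
  }
  responses
}

# stop_for_status() also accepts a bare numeric status code, which allows a
# quick offline check of how a failed request is handled.
outcome <- tryCatch(
  {
    stop_for_status(404L)
    "ok"
  },
  error = function(e) "stopped"
)
print(outcome)
```

Using stop_for_status() is the stricter choice: a partial set of pulls never reaches the analysis step unnoticed.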

Step 3. Retrieve content of the request

If the status is OK (200),*** retrieve the contents of the request as a JSON string by calling content() with the type set to “text”. The available formats are “raw”, “text”, and “parsed”. While “parsed” returns an automatically parsed R object, in this example we use “text” to retrieve the content as a character vector.
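One way to sketch this step is shown below. The helper name response_text is hypothetical, and passing encoding = "UTF-8" is an assumption about how the API encodes its output; it simply saves httr from having to guess the encoding.

```r
library(httr)

# Hypothetical helper: return the body of a successful response as text.
response_text <- function(response) {
  if (status_code(response) != 200) {
    stop("Unexpected status code: ", status_code(response))
  }
  # as = "text" returns the body as a character vector (a JSON string here).
  # The UTF-8 encoding is an assumption, not something the API guarantees.
  content(response, as = "text", encoding = "UTF-8")
}

# Usage, given a response object from GET():
# json_string <- response_text(response)
```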

Step 4. Convert content of JSON string to an R object

Then convert the JSON string to an R object using the fromJSON() function from jsonlite. This function accepts a JSON string, a URL, or a file.
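To illustrate the conversion without calling the API, the sketch below parses a made-up JSON string. The field names and values are placeholders for illustration only and do not reflect real ESS output; see the data dictionary for the actual fields.

```r
library(jsonlite)

# Made-up JSON for illustration only; the field names and values are
# placeholders, not real ESS output.
json_string <- '[{"state": "CA", "score": 1.5}, {"state": "NY", "score": 2.5}]'

# A JSON array of records with the same fields is simplified to a data frame.
scores <- fromJSON(json_string)
str(scores)
```

Because fromJSON() simplifies arrays of records into data frames by default, the result can usually go straight into your analysis.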

Step 5. Assign index labels to each data pull (optional)

If you’re pulling data with a certain demographic cut, make sure to label each data pull so you can keep track of each iteration. For example, if you pull state sentiment scores by Gender, add a new column such as ‘index’ to each pull and set it to ‘Male’ or ‘Female’ (refer to the data dictionary for the correct index assignment).
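The labeling step could be sketched as follows, using two small made-up data frames in place of real data pulls. The column names state and score and all values are placeholders; only the idea of adding an ‘index’ column before combining the pulls comes from the step above.

```r
# Made-up data frames standing in for two parsed data pulls; the columns and
# values are placeholders, not real ESS results.
male_scores <- data.frame(state = c("CA", "NY"), score = c(1.5, 2.5))
female_scores <- data.frame(state = c("CA", "NY"), score = c(2.0, 3.0))

# Label each pull before combining so every row keeps track of its source.
male_scores$index <- "Male"
female_scores$index <- "Female"

# Stack the labeled pulls into one data frame for analysis.
by_gender <- rbind(male_scores, female_scores)
print(by_gender)
```

Labeling before the rbind() call matters: once the pulls are stacked, nothing else in the rows distinguishes one group from the other.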

Curious to Learn More About The Data?

Check out the ESS survey website for an interactive view of this data and monthly summaries on economic sentiments of Thumbtack Pros. For more examples on how to access the ESS Data via R or Python, check out our tutorials for both on GitHub.

Notes

* For a more comprehensive guide to accessing web data in R, “A quickstart guide to httr”, written by Hadley Wickham, is a great resource to check out. It provides an overview of the various functionalities of httr beyond the ones discussed in this post.

** Another way we use the httr package at Thumbtack is to implement Google authentication via OAuth tokens for users accessing Shiny dashboards that are only for internal audiences.

*** A complete guide to HTTP status codes is available at http://www.restapitutorial.com/httpstatuscodes.html.

Originally published at https://engineering.thumbtack.com on March 24, 2017.
