Download Open Census Data & Visualize Neighborhood Insights

SafeGraph
May 17, 2019

Great Machine Learning Runs on Great Data

One of the biggest obstacles to innovation in the machine learning industry is the availability of clean, high-quality datasets. That’s why SafeGraph is committed to getting as much data as possible in front of businesses and researchers.

One way SafeGraph has already made data more accessible is by offering points-of-interest, store visitor insights, and foot-traffic data for purchase at low cost through a self-serve data portal.

Open Census Data is another step towards democratizing access to high-quality data.

Downloading Census Data From The Census Bureau Is Too Difficult

While census data is offered for free on the Census Bureau website, it isn’t as open and easy to use as it may look at first glance.

Nick Singh, who heads growth at SafeGraph, explains:

“Accessing data on the Census Bureau website is a cumbersome process. The UI is confusing to use. You have to do a sequence of steps 50 times for each of the 50 states to get the data at the lowest granularity. Easy bulk access isn’t supported.”

Some of the steps needed to download Census data at the Census Block Group level.

The challenge with downloading Census data is exemplified by the GIS StackExchange question: “Where to get 2010 Census Block data?”

The most upvoted answer leads with:

“It is on the new version of American Factfinder and don’t feel bad, even Census Bureau employees are confounded by the new site.”

The answer goes on to list eight steps, and following them still gets you only part of the data.

SafeGraph’s Open Census Data Gives Bulk Access To All ACS 2016 Data At The Census Block Group Level

SafeGraph’s Open Census Data contains 7500+ demographic attributes (like income, age, education, etc.) available at the Census Block Group level. All data from the American Community Survey is available in bulk with a clean schema and joined with Census Block Group (CBG) geometry.

SafeGraph’s open census dataset includes the following components:

  • All demographic data from the American Community Survey (2016) 5-year estimate at the census block group level.
  • All census block group boundaries formatted as a GeoJSON file.
  • Metadata mapping attribute names to a table ID, census block groups to cities and counties, and census block groups to geographic statistics such as percentage land and water.
  • SafeGraph neighborhood insights for each census block group.
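
Because every component above is keyed by census block group, a common first step is to split the block group identifier and roll values up to a coarser geography. The sketch below assumes each CBG is identified by the standard 12-digit Census GEOID (2-digit state, 3-digit county, 6-digit tract, 1-digit block group). The column names `census_block_group` and `total_population`, the helper names, and the sample rows are made up for illustration; check the dataset's schema for the real names.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; real column names and values come from the dataset's schema.
SAMPLE_CSV = """census_block_group,total_population
060750101001,1200
060750101002,800
360610001001,1500
"""

def split_cbg(geoid):
    """Split a 12-digit block group GEOID into its nested FIPS prefixes."""
    geoid = str(geoid).zfill(12)  # restore a leading zero lost to numeric parsing
    if len(geoid) != 12 or not geoid.isdigit():
        raise ValueError(f"not a 12-digit block group GEOID: {geoid!r}")
    return {
        "state": geoid[:2],    # state FIPS
        "county": geoid[:5],   # state + county FIPS
        "tract": geoid[:11],   # state + county + tract
        "block_group": geoid,
    }

def population_by_county(csv_text):
    """Sum a per-CBG value up to county level using the GEOID prefix."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        county = split_cbg(row["census_block_group"])["county"]
        totals[county] += int(row["total_population"])
    return dict(totals)

county_totals = population_by_county(SAMPLE_CSV)
print(county_totals)
```

Reading the identifier as a string matters here: if a tool parses it as a number, states with a leading zero in their FIPS code lose a digit, which is why the sketch pads the value back to 12 characters.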

This dataset can help you answer questions such as:

  • How do economic characteristics, such as income and employment, differ by neighborhood?
  • How do housing characteristics, such as occupancy and rent paid, differ by neighborhood?
  • How do demographic characteristics, such as age, race, and sex, differ by neighborhood?
  • How do social characteristics, such as marital status and education, differ by neighborhood?

The schema and documentation for the dataset can be found here.

Other Ways To Work With & Visualize Open Census Data

The entire dataset is available for download for free at SafeGraph’s Open Census Data page or on Kaggle.

SafeGraph has also created an interactive map that illustrates the data and allows for easy exploration.

Free SafeGraph Patterns Data At The Census Block Group Level Included

In addition to the Census data, SafeGraph has also included a version of SafeGraph Patterns at the neighborhood (census block group) level. This dataset answers questions such as:

  • What are the most popular brands in a neighborhood? Are there regional preferences for some brands over others?
  • How do people travel between neighborhoods? How does the distance traveled compare for suburban and urban communities?
  • What times do people visit certain census block groups (e.g., Manhattan during the day vs. at night)?
  • Which neighborhoods are the most mobile? Which neighborhoods receive the most outside visitors?

The free open neighborhood analytics dataset is a less granular version of SafeGraph’s premium Places Patterns dataset. The premium Patterns dataset reports data at the “place” (store location) level. The free dataset reports data at the Census Block Group (CBG) level; a block group generally contains roughly 600 to 3,000 people.

So instead of reporting distance traveled to a store or top related brands for a specific place like in the Places Patterns dataset, the free neighborhood insights show how far people traveled to reach a neighborhood and the brand preferences for a whole community.
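
As a rough illustration of working with brand preferences at the neighborhood level, the sketch below totals brand visits across block groups and picks the leading brand in each one. The input structure (a mapping from CBG identifier to per-brand visit counts), the brand names, and the numbers are all hypothetical; the actual field names and encoding are defined in the dataset's documentation.

```python
from collections import Counter

# Hypothetical per-neighborhood brand visit counts, keyed by CBG identifier.
cbg_brand_visits = {
    "060750101001": {"Brand A": 40, "Brand B": 25},
    "060750101002": {"Brand B": 20, "Brand C": 10},
    "360610001001": {"Brand A": 15, "Brand C": 50},
}

def top_brands(visits_by_cbg, n=3):
    """Return the n most-visited brands summed over all block groups."""
    totals = Counter()
    for brand_counts in visits_by_cbg.values():
        totals.update(brand_counts)  # adds counts brand by brand
    return totals.most_common(n)

def top_brand_per_cbg(visits_by_cbg):
    """Return the single most-visited brand within each block group."""
    return {
        cbg: max(counts, key=counts.get)
        for cbg, counts in visits_by_cbg.items()
        if counts  # skip block groups with no brand data
    }

overall = top_brands(cbg_brand_visits, n=2)
per_cbg = top_brand_per_cbg(cbg_brand_visits)
print(overall)
print(per_cbg)
```

Comparing the overall ranking with the per-neighborhood leaders is one simple way to spot regional preferences: a brand can lead in one block group while ranking low across the whole sample.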

How Organizations Are Using Open Census Data & Neighborhood Insights

Ryan Fox Squire, who works on Product and Data Science at SafeGraph, points out an interesting use case for real estate analytics:

“One use case is a team of people at a big retail company using the data to decide where they should open a new store for their company. A big part of these analyses involves demographic data for candidate neighborhoods. Traditionally, you only look at what are the demographics of people who live physically near the new location.

But SafeGraph’s neighborhood insights in combination with census data lets you analyze not only who lives in this locality but also who travels to be near the candidate location. For example, most people spend a lot of time and money in places near where they work during the day and not near their homes.

Without knowing which communities people commute to, you have an incomplete picture for retail analyses. Neighborhood patterns and open census data help give one the whole picture.”

Another use case comes from Neoway, which combined SafeGraph Patterns with the Open Census Data to advise a major beverage maker on how to optimize its product mix at restaurants, bars, and stores, based on each location’s unique profile and demographic mix.

A data scientist on Kaggle used the neighborhood insights to find which brands were visited most across all neighborhoods.

Want more data?

SafeGraph Places has building footprint data and business listing info for 5 million places (Points-of-Interest) in the U.S., covering almost every place you can spend money, from top retail brands to small mom-and-pop stores.

SafeGraph Places Patterns is a dataset of insights, such as distance traveled and top home CBG, for visitors to these Points-of-Interest. By combining US Census demographic data with Places Patterns, one can get detailed demographic insights on a given store’s visitors.
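
One way such a combination might work is to weight a census attribute by how many visitors come from each home CBG. The sketch below is only an approximation under stated assumptions: the two input mappings, their values, and the function name are hypothetical, and a visitor-weighted average of per-CBG median incomes is not the true median income of the visitors themselves.

```python
# Hypothetical census attribute per block group (e.g., median household income).
median_income_by_cbg = {
    "060750101001": 90000,
    "060750101002": 60000,
    "360610001001": 75000,
}

# Hypothetical visitor counts for one store, keyed by visitor home CBG.
visitor_home_cbgs = {
    "060750101001": 30,
    "060750101002": 10,
    "999999999999": 5,  # no census match; excluded from the average
}

def visitor_weighted_mean(visitors_by_home_cbg, attribute_by_cbg):
    """Return (weighted mean of the attribute, share of visitors matched)."""
    weighted_sum = 0.0
    matched = 0
    for cbg, visitors in visitors_by_home_cbg.items():
        value = attribute_by_cbg.get(cbg)
        if value is None:
            continue  # home CBG missing from the census table
        weighted_sum += visitors * value
        matched += visitors
    if matched == 0:
        return None, 0.0
    total = sum(visitors_by_home_cbg.values())
    return weighted_sum / matched, matched / total

mean_income, coverage = visitor_weighted_mean(visitor_home_cbgs, median_income_by_cbg)
print(mean_income, round(coverage, 3))
```

Reporting the matched share alongside the average makes it clear how much of the visitor base the estimate actually reflects.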

Both datasets are available to preview & purchase on SafeGraph’s Data Bar.
