Detroit’s Housing Data Experiment

Rebecca Rosen
5 min read · Mar 10, 2020


I recently moved from Detroit, MI to Brooklyn, NY. Both cities are known for their cultural richness, and both have seen population changes that brought community disruption and displacement. In an effort to better understand these trends, I decided to explore which variables could predict significant real estate trends.

Note: this post covers the first 3 steps of my experimentation process. Subscribe to my stories to see more about the results and next steps!

Open Data Research

Starting off, I was curious about how a wide range of features affected residential housing in Detroit. I wanted to see if I could predict housing vacancy using indicators such as registered rentals, reported major crimes, or tax foreclosures. All of this information was easy to find on Detroit’s Open Data Portal, but the largest hurdle I faced was finding an appropriate join key: some tables included zip codes, some just had the city name, and others only included census tract. This variety of identifying information makes it very difficult to merge tables and integrate findings.

A photo showing the evolving complexity of the census tract splits — specifically in RI. Public Source

I spent a good amount of time searching for resources that could clearly relate census tract and zip code, but found no information stored in tables. The most useful thing I found was a map of the city outlining the area of each census tract (see above for an example). I could have transcribed this information into my own table, but another hurdle was that census tract boundaries change with each new census. So if I was working with one table over multiple years, or many tables over multiple years, even a comprehensive map of all the years would take a lot of time to organize manually. As such, I dropped some of the features I was most curious about, such as crime rates and tax foreclosures, and instead looked at the tables with the most ubiquitous primary keys.
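A quick way to see which tables can be merged directly is to compare their column names before attempting any join. The sketch below is only an illustration: the table names and columns are hypothetical and are not the actual schemas on Detroit’s Open Data Portal.

```python
# Hypothetical column listings for three tables; the names are illustrative,
# not the real schemas on Detroit's Open Data Portal.
table_columns = {
    "blight_citations": {"parcel_id", "zip_code", "ticket_count", "fine_total"},
    "registered_rentals": {"parcel_id", "zip_code", "registration_date"},
    "major_crimes": {"census_tract", "offense", "incident_date"},
}


def shared_keys(columns_by_table):
    """Return the column names that appear in every table."""
    column_sets = list(columns_by_table.values())
    return set.intersection(*column_sets) if column_sets else set()


def tables_with_key(columns_by_table, key):
    """Return the sorted names of the tables that contain a candidate key."""
    return sorted(name for name, cols in columns_by_table.items() if key in cols)


# No single column is shared by all three tables in this toy example.
keys_in_all_tables = shared_keys(table_columns)

# Only some of the tables can be joined on zip code.
zip_code_tables = tables_with_key(table_columns, "zip_code")

print(keys_in_all_tables)
print(zip_code_tables)
```

In this toy setup the crime table only carries a census tract, so it cannot be merged with the other two without a tract-to-zip lookup, which mirrors the hurdle described above.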

Deciding Final Features and Target Variable

After looking at each of the features, I decided to focus on the number of blighting citation tickets as my target variable, for two main reasons: what the concept implies about real estate in the region, and how the specific feature was organized. I expand on both reasons below.

Blight -

Urban blight, or “urban decay”, is “the sociological process by which a previously functioning city, or part of a city, falls into disrepair and decrepitude” (Wiki). Detroit has been known among some for its “ruin porn”, a feature that has drawn in a number of urban explorers over the years. While markedly controversial in its naming and exploitation, the phenomenon reflects interest in the city’s stark contrast between grandiose buildings and seemingly abandoned disrepair. This atmosphere, along with economic prospects brought in largely by Quicken Loans’ investment in the area, has been attracting people from around the world to move into the city.

Interior of St. Agnes Church in Detroit, MI. Source

On a more neighborhood level, the concept of blight marks not the upswing of tourism, but the likelihood of static housing activity. A neighborhood that is “run down”, as defined by the presence of these blighting citations, may be less attractive for real estate development. For these reasons, I felt that tracking blight would implicitly track some of these other pressing questions about a shifting real estate market in Detroit.*

Additionally, the table with blight citations tracked exactly how many citations a lot had received, and the total amount of accumulated fines. Joining this with the table of all registered lots in Detroit allowed me to frame the problem either as a multi-class classification problem (how many citations a lot has) or as a binary classification problem (any citation or none).
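To make the two framings concrete, here is a minimal sketch of joining citations onto a table of lots and deriving both targets. The rows, the `parcel_id` key, and the output column names are invented for illustration; the real tables and the steps actually used for this project may differ.

```python
from collections import defaultdict

# Invented sample rows; real column names and values will differ.
lots = [
    {"parcel_id": "A1", "zip_code": "48201"},
    {"parcel_id": "B2", "zip_code": "48202"},
    {"parcel_id": "C3", "zip_code": "48201"},
]
citations = [
    {"parcel_id": "A1", "fine": 250.0},
    {"parcel_id": "A1", "fine": 100.0},
    {"parcel_id": "C3", "fine": 50.0},
]


def build_targets(lots, citations):
    """Left-join citations onto lots and derive count and binary targets."""
    counts = defaultdict(int)
    fines = defaultdict(float)
    for row in citations:
        counts[row["parcel_id"]] += 1
        fines[row["parcel_id"]] += row["fine"]

    labeled = []
    for lot in lots:
        pid = lot["parcel_id"]
        n = counts.get(pid, 0)  # lots with no citations keep a count of 0
        labeled.append({
            **lot,
            "citation_count": n,                 # multi-class target
            "total_fines": fines.get(pid, 0.0),
            "has_citation": int(n > 0),          # binary target
        })
    return labeled


labeled_lots = build_targets(lots, citations)
for row in labeled_lots:
    print(row)
```

Keeping every lot, including those with no citations, is what makes the "none" class available for the binary framing.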

*Who/what these developments may be disrupting are relevant questions, and I hope this study might help inform future critique and action toward justice. Please explore Riverwise for some community activism happening in Detroit now.

Final table -

All of this culminated in the organization of my final table — features in all caps are straight from raw data, features in lower case were ones that I engineered from the raw data:

Features of my final table - name of column on the left, description on the right

Developing Hypotheses

I was seeing the emergence of two different archetypal Detroit homeowners: one who lives there, and one who buys a home from outside the city. The latter is a common and growing trend that either tracks or leads to further gentrification. This led me to my second curiosity: the citizens buying the homes, and the ones renting them out. There are different hurdles to integrating “transplants” into any new community, but do these two personas interact differently with regard to housing maintenance?

What I am really wondering is: is gentrification actually better for economic development? If we assume that economic development is reflected by the real estate prospects of a collection of well-maintained lots, what can we learn about the effects of gentrification from this dataset? Is it the case that rental properties receive fewer blighting citations than non-rental properties? And similarly, are Detroit homeowners treating their homes better than out-of-town homeowners?

As an extension of these musings, my hypotheses were as follows:

  1. Rental properties will have more blighting citations than non-rental properties
  2. Detroit homeowners will have fewer blighting citations than “foreign” owners
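A simple first look at both hypotheses is to compare the share of lots with at least one citation across each pair of groups. The sketch below uses invented rows and hypothetical flag names (`is_rental`, `owner_in_detroit`); it is a descriptive comparison only, not a significance test, and not necessarily the analysis used for the results.

```python
# Invented rows for illustration; flag names are hypothetical.
sample_lots = [
    {"is_rental": True,  "owner_in_detroit": False, "citation_count": 2},
    {"is_rental": True,  "owner_in_detroit": True,  "citation_count": 0},
    {"is_rental": False, "owner_in_detroit": True,  "citation_count": 0},
    {"is_rental": False, "owner_in_detroit": True,  "citation_count": 1},
    {"is_rental": False, "owner_in_detroit": False, "citation_count": 3},
]


def citation_rate(rows, flag):
    """Share of lots with at least one citation, split by a boolean flag."""
    totals = {True: 0, False: 0}
    cited = {True: 0, False: 0}
    for row in rows:
        group = bool(row[flag])
        totals[group] += 1
        cited[group] += int(row["citation_count"] > 0)
    # Return None for a group with no lots to avoid dividing by zero.
    return {g: (cited[g] / totals[g] if totals[g] else None) for g in totals}


# Hypothesis 1: rentals vs. non-rentals.
rental_rates = citation_rate(sample_lots, "is_rental")

# Hypothesis 2: Detroit-based owners vs. owners from outside the city.
owner_rates = citation_rate(sample_lots, "owner_in_detroit")

print(rental_rates)
print(owner_rates)
```

With real data, a gap between the two rates would only be a starting point; group sizes and confounders such as zip code would still need to be accounted for.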

My results were somewhat surprising, with the data analysis being doubly engaging — with 20 features and 381,353 rows, there was a lot to explore! As a teaser, here is one interesting graph that shows the distribution of rental properties and Detroit-based homeowners by zip code:

Number of registered rentals per zip code, divided by number of Detroit-based homeowners.

Next Steps

Subscribe to my blog to view some exploratory data analysis on the final data frame, further questions that were kicked up by doing so, and the insights found in regard to my two hypotheses.

Rebecca Rosen is a recent graduate of Flatiron’s Data Science Immersive with a background in Cognitive Science and non-profit management. She currently resides in NYC and makes music in her spare time. To see more of her work, or to say hi, search Medium, Instagram or Facebook with the tag @rebeccahhrosen.
