Note: This blog uses a pre-existing set of data obtained from Detroit’s Open Data Portal. Read more about the background of this project in my last blog post. Find details on the cleaning, hypothesis tests and outcomes on my GitHub.

Photo by Ross Joyner on Unsplash

Main Objective: Analyze housing trends in Detroit, MI to better understand how individuals interact with the city based on their home’s rental status.

In particular, I was curious about 3 different things:

  1. Is the probability of a rental house having any blight citations equal to the probability of an owned home having any blight citations?
  2. Do rentals receive the same average number of blight citations as non-rentals? …
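The first question compares two proportions, so one common way to frame it is a two-proportion z-test. The sketch below is only an illustration of that idea, not necessarily the test used in the project: the function name is hypothetical and the counts are made up rather than taken from the Detroit data.

```python
import math


def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Two-sided z-test for H0: p_a == p_b, using the pooled proportion."""
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts for illustration only (NOT from the Detroit data):
# group A = rentals with at least one citation, group B = owned homes
z, p_value = two_proportion_z_test(hits_a=120, n_a=1000, hits_b=90, n_b=1000)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

A small p-value would suggest the two probabilities differ. The second question compares averages rather than proportions, so it would call for a different test, such as a two-sample t-test.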


I recently moved from Detroit, MI to Brooklyn, NY. Both of these cities are known for their cultural richness, as well as for population changes that have led to community disruption and displacement. In an effort to better understand these trends, I decided to explore what sorts of variables could predict significant real estate trends.

Note: This post will cover the first 3 steps of my experimentation process. Subscribe to my stories to see more about the results and next steps!

Open Data Research

Starting off, I was curious about how a wide range of features affected residential housing in Detroit. I wanted to see if I could predict housing vacancy using indicators such as registered rentals, reported major crimes, or tax foreclosures. I easily found all of this information on Detroit’s Open Data Portal, but the largest hurdle I faced was finding an appropriate join key: some tables included zip codes, some just had the city name, and others only included the census tract. This variety of identifying information makes it very difficult to merge tables and integrate findings. …


This weekend I had the pleasure of participating in a “Love Ya Like A Sis” hackathon, as a part of NYC OpenData week. In a similar style to DataKind’s DataDeepDive weekends, we were given a team of diverse experts, a specific problem, and about 8 hours to solve it.

While our GitHub repo and more specific workflows will be uploaded soon, this article will be a short exploration of the data resources available to tackle the problem presented to my team.

The Challenge:

The NYC Commission on Gender Equity (NYC CGE) develops policy to combat the gender pay gap, including best practices for businesses in “Leveling the Paying Field.” …


During a side project, I found myself running into trouble with the pandas.DataFrame.merge function. Some rows were duplicated and others were not, and it seemed that the function was filling multiple instances of the smaller data frame’s rows into the rows of the other. I simply wanted to see NaN values in their place, so I took this as an opportunity to explore the merge function in greater depth.

Note: pandas and NumPy are the only packages necessary to follow along with this article.
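For readers who have not hit this behavior yet, here is a minimal sketch of the kind of duplication described above. The frames and column names are hypothetical stand-ins, not the data from the side project. The point is that a repeated key on both sides produces every pairing of the matching rows, while keys with no match get NaN.

```python
import pandas as pd

# Hypothetical frames for illustration; not the data from the side project
left = pd.DataFrame({"key": ["a", "b", "b", "c"], "left_val": [1, 2, 3, 4]})
right = pd.DataFrame({"key": ["b", "b", "d"], "right_val": [10, 20, 30]})

# Left merge: keys missing from `right` ("a", "c") get NaN, but each "b" row
# on the left is paired with every "b" row on the right (2 x 2 = 4 rows)
merged = left.merge(right, on="key", how="left")
print(merged)

# The `validate` argument asks pandas to check the key relationship first.
# "many_to_one" requires unique keys on the right, so this merge is rejected.
try:
    left.merge(right, on="key", how="left", validate="many_to_one")
    validation_failed = False
except pd.errors.MergeError:
    validation_failed = True
print("duplicate keys found on the right:", validation_failed)
```

Passing `validate` is one way to surface unexpected duplicates early instead of discovering them in the row count afterwards.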

For this article, A will track with df1 and B will track with df2, as defined below.

Pandas’ merge vs. SQL’s join

Python is a bit trickier with merges as compared to SQL — the documentation in Pandas even refers to SQL joins to clarify its own merge function! The picture below refers to SQL joins, but is very useful to consider in this example as well.* …
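To make the SQL analogy concrete, the sketch below runs the four common values of the `how` parameter, which correspond to SQL’s INNER, LEFT OUTER, RIGHT OUTER and FULL OUTER joins. The frames `frame_a` and `frame_b` are hypothetical stand-ins for A and B, not the df1 and df2 defined in the article.

```python
import pandas as pd

# Hypothetical stand-ins for A and B; not the article's df1 and df2
frame_a = pd.DataFrame({"id": [1, 2, 3], "a_val": ["x", "y", "z"]})
frame_b = pd.DataFrame({"id": [2, 3, 4], "b_val": ["p", "q", "r"]})

row_counts = {}
for how in ["inner", "left", "right", "outer"]:
    # indicator=True adds a `_merge` column showing where each row came from:
    # "left_only", "right_only", or "both"
    result = frame_a.merge(frame_b, on="id", how=how, indicator=True)
    row_counts[how] = len(result)
    print(f"--- how='{how}' ---")
    print(result)
```

The `_merge` column is a quick way to check which rows matched and which were filled with NaN.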


The 2016 election took a lot of people by surprise, due in large part to miscalculated voting predictions made around that time. Many have reason to doubt the validity of polling altogether. However, surveying citizens is a useful tool for understanding voter tendencies and organizing a campaign around real needs. So how can this public skepticism be met with clear information to help citizens and candidates make more informed decisions about the political landscape?

In an effort to be more transparent about polling limitations, a major news outlet has decided to change the way that citizen surveys are done. Follow along below to learn exactly how the New York Times partnered with the Siena College Research Institute to create a new polling strategy, one that includes live updates on responses to upcoming elections and may be more reliable than the others. …


China’s “Social Credit System” has been a topic of conversation in the West since 2015, when an ACLU article warned against its potential dangers. However, the concept of a credit system clearly predates that, in the West and worldwide. What is the difference between other credit rewards systems and this one? As China approaches its desired implementation date of 2020, there’s good reason to believe that it will be coming up more and more in conversation.

Photo by Chastagner Thierry on Unsplash

First, some linguistics:

China’s Social Credit System (SCS, 社会信用体系 or shehui xinyong tixi) can also be translated as “community faith system”. This translation is noteworthy because emphasis is usually placed on the credit aspect rather than the trust or faith interpretation. The middle phrase is popularly translated as “credit” (信用 or xinyong), but it can also be understood as “using truth” (xin means truth and yong “to use”). This perspective adds complexity to a concept that currently has a solely financial interpretation. …


Missing data can skew findings, increase computational expense, and frustrate researchers. In recent years, dealing with missing data has become a more prominent concern in fields like the biological and life sciences, as we are seeing very direct consequences of mismanaged null values¹. In response, more diverse methods for handling missing data are emerging.

This is great for increasing the effectiveness of studies, and a bit tricky for aspiring and active data scientists to keep up with. …


As a student of multiple interdisciplinary sciences, I am constantly thinking about ways to integrate.

Be it disparate ideas, distant friends, or even conflicting perspectives, I love that we can use our minds to resolve conflicts or contradictions. So I am excited when I find opportunities to do so, especially at times when I think that I see the world clearly, but my view turns out to differ from another’s.

Image from Propmodo

One of these ideas that seems really clear to me is the role of slavery. I would think that, as a society, we can all agree that it is wrong, yes? However, there are more slaves in the world today than at any other point in recorded history. Slavery can be defined as “being forced to work without pay, under threat of coercion or violence, for the purposes of exploitation or bondage, against one’s own will”. According to the International Labour Organization (ILO), 40.3 million people face slavery today, and of these victims, 25% are children and 75% are women or girls. …


In researching “Data for Good” practices throughout the world, you will find Cathy O’Neil’s book, Weapons of Math Destruction, coming up more often than many other resources. In addition to the witty title, this is due to the book’s relevance and streamlined articulation.

Aside from offering a much-needed overview of what concepts like algorithms and mathematical models actually are, the author weaves in anecdotes and case studies to communicate the impact on everyday individuals. This combination of explanation and storytelling keeps a reader captivated through complex and controversial concepts.


After describing herself as a nose-down math nerd from elementary school through academia, O’Neil explains her switch into the financial industry. She writes: “At first I was excited and amazed by working in this new laboratory, the global economy … [but] the crash made it all too clear that mathematics… was not only deeply entangled in the world’s problems but also fueling many of them.” …

About

Rebecca Rosen

Graduate of Flatiron School’s Data Science Immersive, currently living in New York City by way of Detroit, MI. Curious about systems, people & effective cohesion.
