Published in Analytics Vidhya

Exploring Residential Noise Complaint Trends in 311 Requests with Python

Photo by Ethan Bykerk on Unsplash

In this post I will use Python to explore the NYC 311 dataset. The dataset is available from the NYC OpenData website and contains all 311 requests from January 1, 2010 until the present. For the purposes of this post I will look at all requests made before October 1, 2020. All data are fetched through API queries.
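The queries themselves are not shown in this post. As a rough sketch of what one could look like: NYC OpenData is served through the Socrata API, which accepts SoQL parameters such as `$where`, `$limit` and `$offset`. The dataset identifier `erm2-nwe9`, the column names `created_date` and `complaint_type`, and the label `Noise - Residential` are assumptions that should be checked against the dataset page, and `build_query_url` is a hypothetical helper, not part of any library.

```python
from urllib.parse import urlencode

# Assumed endpoint for the 311 dataset on NYC OpenData; verify before use.
BASE_URL = "https://data.cityofnewyork.us/resource/erm2-nwe9.json"


def build_query_url(start, end, complaint_type=None, limit=50000, offset=0):
    """Build a SoQL query URL for requests created in [start, end)."""
    where = f"created_date >= '{start}' AND created_date < '{end}'"
    if complaint_type is not None:
        where += f" AND complaint_type = '{complaint_type}'"
    params = {
        "$where": where,
        "$order": "created_date",
        "$limit": limit,    # rows per page
        "$offset": offset,  # rows to skip, used for paging
    }
    return f"{BASE_URL}?{urlencode(params)}"


# Requests from January 1, 2020 up to (not including) October 1, 2020.
url = build_query_url("2020-01-01T00:00:00", "2020-10-01T00:00:00",
                      complaint_type="Noise - Residential")
print(url)
```

The resulting URL can be passed to any HTTP client. Since a single response is capped in size, a full download would loop over increasing `offset` values until an empty page comes back.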

In the course of the exploration, I will ask a number of questions:

  1. What are the annual and monthly trends in the total number of claims in New York City?
  2. What are the significant types of claims?
  3. Are different types of claims correlated? Before 2020? During the Pandemic?
  4. How is a particular complaint type, such as Residential Noise, spatially distributed?

I will close this post by recommending some further work with this dataset.

1. What are the annual and monthly trends in the total number of claims in New York City?

Let’s start by counting the number of requests each year. After two years of decline from 2010, the annual number of claims in New York City climbed steadily from 2012. The number peaked in 2018 at 2,747,951 and decreased in 2019. Average daily requests over the last decade range from 4,921 in 2012 to 7,528 in 2018.

The monthly chart breaks down the annual trend and reveals some periodicity. The monthly number of claims is generally climbing, but within each year it rises and falls seasonally. It is also apparent that the number of 311 requests dropped significantly in the second half of 2019.

Overall, New Yorkers have been filing more and more claims over the last decade.
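The yearly and monthly tallies above come down to grouping request timestamps by year and by (year, month). Here is a minimal sketch using only the standard library; the timestamps are made-up stand-ins for the creation date of each request, and `daily_average` is a hypothetical helper. With pandas the same result would come from a group-by on the date column.

```python
from collections import Counter
from datetime import datetime

# Toy stand-in for the creation timestamps of 311 requests (made-up values).
created_dates = [
    "2018-01-03T09:15:00", "2018-01-20T22:40:00", "2018-07-04T23:05:00",
    "2019-02-11T08:00:00", "2019-07-19T14:30:00",
]

stamps = [datetime.fromisoformat(s) for s in created_dates]
per_year = Counter(ts.year for ts in stamps)                # requests per year
per_month = Counter((ts.year, ts.month) for ts in stamps)   # requests per month


def daily_average(total_requests, days_in_year=365):
    """Average number of requests per day for a yearly total."""
    return total_requests / days_in_year


print(per_year)
print(per_month)
# The 2018 total quoted above, spread over 365 days.
print(int(daily_average(2_747_951)))
```

Applied to the 2018 total of 2,747,951, the daily average truncates to the 7,528 quoted above.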

2. What are some significant types of claims over the last decade?

311 has taken more than 100 types of claims over the years, yet some of them overlap and others fell out of use at some point. We define a claim type as significant if it has made it into the top 5 most-filed claim types of any year in the past decade, and then trace the trends of those types over the years. Some complaint types are duplicates of each other, so we sum them into a single type: for example, Heat and HEAT/HOT WATER become heat_hot_water, and Plaster - Paint and PLASTER/PAINT become plaster_paint.
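To make the selection rule concrete, here is a small sketch of the merge-then-rank step. The two merged pairs are the ones named above; the function name `significant_types`, the record layout and the toy records are my own assumptions for illustration, not the code behind the charts.

```python
from collections import Counter, defaultdict

# Map raw labels to one merged name; these two pairs are the ones named above.
MERGE = {
    "Heat": "heat_hot_water",
    "HEAT/HOT WATER": "heat_hot_water",
    "Plaster - Paint": "plaster_paint",
    "PLASTER/PAINT": "plaster_paint",
}


def significant_types(records, top_n=5):
    """Return the types that reach the top_n most-filed list in any year.

    `records` is an iterable of (year, raw_complaint_type) pairs.
    """
    per_year = defaultdict(Counter)
    for year, raw_type in records:
        # Merge duplicate labels before counting.
        per_year[year][MERGE.get(raw_type, raw_type)] += 1
    significant = set()
    for counts in per_year.values():
        significant.update(name for name, _ in counts.most_common(top_n))
    return significant


# Made-up records; top_n=1 keeps the toy example small.
records = [
    (2013, "Heat"), (2013, "Heat"), (2013, "Plumbing"),
    (2014, "HEAT/HOT WATER"), (2014, "Blocked Driveway"),
    (2014, "Blocked Driveway"),
]
print(significant_types(records, top_n=1))
```

Merging before ranking matters: a type that was renamed partway through the decade would otherwise have its count split in two and could miss the top 5.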

Below are the trends of the significant complaint types over the last decade:

We have a couple of findings:

  • Heat and/or Hot Water has stayed consistently high over the decade.
  • Residential Noise has climbed significantly.
  • Illegal Parking and Blocked Driveway seem to be increasing simultaneously.

In addition, General Construction gradually fell out of use by 2014; Plumbing declines rather slowly; Street Condition has some ups and downs; Request Large Bulky Item Collection appears in 2017 at a little below 50,000, more than triples in 2018, and falls back to around 100,000 in 2019.

3. Are different types of claims correlated? Before 2020? During the Pandemic?

First we want to finish the study of the complaints filed between January 1, 2010 and January 1, 2020. Again we select the significant types of claims and merge the duplicated types. I also dropped a few complaint types that are not used consistently for classification. I end up with the correlation matrix visualized below:

From the chart we see four groups of positively correlated complaint types:

  • Blocked Driveway and Illegal Parking: it seems obvious that Illegal Parking could induce Blocked Driveway.
  • Dying Tree and Overgrown Tree/Branches: perhaps their correlation might be explained by certain weather events.
  • ELECTRIC, paint_plaster and construction_plumbing: their correlation is not obvious to me.
  • Street Noise and Vehicle Noise.
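For readers who want to reproduce a matrix like this, here is a minimal sketch of the underlying computation: a Pearson correlation between the count series of every pair of complaint types. I assume one count per month for each type, which is only one possible choice of time unit, and the numbers below are made up. With pandas, the same matrix would come from calling `.corr()` on a table with one column per type.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(series):
    """Pairwise correlations for a dict of {type: [count per period]}."""
    names = list(series)
    return {a: {b: pearson(series[a], series[b]) for b in names} for a in names}


# Made-up monthly counts, for illustration only.
monthly = {
    "Illegal Parking": [100, 120, 140, 160],
    "Blocked Driveway": [80, 90, 100, 110],
    "heat_hot_water": [400, 300, 200, 100],
}
matrix = correlation_matrix(monthly)
print(round(matrix["Illegal Parking"]["Blocked Driveway"], 3))
print(round(matrix["Illegal Parking"]["heat_hot_water"], 3))
```

In the toy data the first two series rise together and the third falls, so the first pair is strongly positively correlated and the second strongly negatively. A high correlation of this kind says that two types rise and fall together over time, not that one causes the other.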

Next we study the complaints filed in 2020 before October 1. Cleaning the dataset as in the previous part, we first take a glance at the significant types of 2020. Notice that Non-compliance with Phased Opening has emerged as a major complaint type since June.

A couple of interesting things:

  • In August we saw a lot of 311 complaints about Damaged Trees. Recall that Tropical Storm Isaias hit the New York area on August 4. According to this NYTimes article, “more than two and a half million customers lost power and at least one person was killed.”
  • In May and June complaints about Illegal Fireworks soared. Illegal fireworks usually occur in late June, as July 4 approaches. This year, however, fireworks complaints came sooner, alongside the unrest New Yorkers were experiencing. Here is another NYTimes article on the skyrocketing Illegal Fireworks complaints this year.

We also get the following correlation matrix visualization:

Illegal Parking is still well correlated with Blocked Driveway, but there are also some altered patterns of correlation between complaint types:

  • Residential Noise, Street Noise and Vehicle Noise are more correlated.
  • Non-compliance with Phased Opening and Street Condition showed correlation.

4. How is a particular complaint type, say, Residential Noise, spatially distributed during the pandemic?

Before we look into the spatial distribution of Residential Noise, we might be interested in how its trend in 2020 compares to its historical record.

Complaints about Residential Noise skyrocketed starting in May, reaching almost twice the largest monthly number of complaints in 2019. In June and July the numbers dropped by a couple of thousand, then jumped to over 50,000 in August and stayed there in September. It is safe to conclude that the 2020 trend is NOT simply a continuation of the growing trend since 2012.

Now we can check how the Residential Noise complaints have been spatially distributed in 2019 and 2020.

To create a choropleth of the number of complaints within each census tract, we first query all the requests in the Residential Noise category within our specified time frame and compile them into a geopandas dataframe using the longitude and latitude columns. Then we perform a spatial join between our geopandas dataframe and our New York City census tract shapefile, counting the number of points, i.e., instances of complaints, which fall into each tract. The census tract information is also available on the New York City OpenData website.
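The spatial join is, at its core, a point-in-polygon count. geopandas handles this on real tract geometries through `sjoin`; the sketch below is not that code but a pure-Python illustration of the same idea, using a ray-casting test on two made-up square "tracts". The function names and the toy coordinates are assumptions for illustration only.

```python
from collections import Counter


def point_in_polygon(x, y, polygon):
    """Ray-casting test; `polygon` is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray going right from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_points_per_tract(points, tracts):
    """Count points falling in each tract; `tracts` maps id -> polygon."""
    counts = Counter({tract_id: 0 for tract_id in tracts})
    for x, y in points:
        for tract_id, polygon in tracts.items():
            if point_in_polygon(x, y, polygon):
                counts[tract_id] += 1
                break  # tracts do not overlap, so stop at the first hit
    return counts


# Two made-up unit squares standing in for census tracts.
tracts = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],
}
# (x, y) plays the role of (longitude, latitude); the last point is outside both.
points = [(0.5, 0.5), (0.2, 0.8), (1.5, 0.5), (3.0, 3.0)]
tract_counts = count_points_per_tract(points, tracts)
print(tract_counts)
```

The per-tract counts are what gets mapped in the choropleth. On the full dataset a library with a spatial index is the practical choice, since this nested loop tests every point against every tract.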

The classification scheme I selected to create the choropleth is boxplot. I picked this one because it shows where the outliers, i.e. the extremely high numbers of complaints, fall. Here "minimum" and "maximum" mean the ends of the boxplot whiskers rather than the smallest and largest observed values. The boxplot scheme divides the data into the following categories:

  • below minimum (outliers)
  • minimum to 25%
  • 25% to 50% (median)
  • 50% (median) to 75%
  • 75% to maximum
  • above maximum (outliers)

In our case, the first category, which catches values below the minimum line, is meaningless, since a tract cannot have a negative number of complaints. However, we want to pay attention to the last category, above maximum, which marks the tracts with an extraordinary number of claims in the darkest green.
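To see where the six categories come from, here is a small sketch that computes the class breaks by hand. I assume the usual boxplot convention in which the whiskers sit 1.5 interquartile ranges beyond the quartiles; the `hinge` parameter, the function names and the tract counts are my own illustrative choices, and the exact settings behind the maps may differ.

```python
from statistics import quantiles


def boxplot_breaks(values, hinge=1.5):
    """Upper bounds of the first five boxplot classes.

    Returns (lower_fence, q1, median, q3, upper_fence); anything above
    upper_fence belongs to the sixth class (high outliers).
    """
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return (q1 - hinge * iqr, q1, median, q3, q3 + hinge * iqr)


def classify(value, breaks):
    """Index (0 to 5) of the boxplot class that `value` falls into."""
    for index, upper in enumerate(breaks):
        if value <= upper:
            return index
    return len(breaks)


# Made-up complaint counts per tract.
counts = [10, 20, 30, 40, 50, 60, 70, 80, 90, 500]
breaks = boxplot_breaks(counts)
print(breaks)
print([classify(c, breaks) for c in counts])
```

With these toy counts the lower fence is negative, so the first class stays empty, while the tract with 500 complaints lands in the last class as an outlier.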

Below are the spatial distributions of Residential Noise claims in 2019 and 2020:

The two choropleths show very similar spatial patterns. Tracts in a given category in 2019 mostly stay in the same category in 2020, although the exact numbers demarcating each category have changed. The median number of complaints per New York City tract is 65 for all of 2019 and 82 for the first nine months of 2020. This further indicates that Residential Noise has gotten worse during the pandemic.

More analysis could be done with respect to this category of complaints:

  • Noise might not always be a result of bad neighbors; maybe the building is old. Investigate whether there is any correlation between the spatial distribution of noise complaints and that of buildings with certain features.
  • Perhaps some neighborhoods are densely populated and thus produce more claims. Investigate the population density of some specific neighborhoods.
  • Beyond the spatial dimension, how are complaints distributed over the course of a day?


