Q#45: NYC Complaints

Published in

Foundational Data Science: Interview Questions

2 min read · Aug 11, 2021

Suppose you have the following dataset, which shows a subset of 311 service requests from NYC Open Data. Given the dataset, plot the most common complaint types across all boroughs.

TRY IT YOURSELF

ANSWER

This question tests your ability to quickly summarize a large dataset as a visual. To do this, we will use Python, specifically the Pandas library, to load the data, count the values we care about, and return a plot.

First, we load the data into a dataframe with pd.read_csv(<link>). Next, we chain methods on the dataframe to count complaints by type across all boroughs, pick out the top 10, and plot them. To get the counts, we select the column with df['Complaint Type'] and call .value_counts(), which returns the counts sorted from most to least frequent, so .iloc[:10] keeps just the top 10 complaint types. Finally, we call .plot() with the kind='bar' argument to draw a bar plot of the results.

import pandas as pd

# Load the 311 service request data into a dataframe
df = pd.read_csv('https://raw.githubusercontent.com/erood/interviewqs.com_code_snippets/master/Datasets/311-service-requests.csv')
# Count complaint types, keep the 10 most common, and draw a bar plot
df['Complaint Type'].value_counts().iloc[:10].plot(kind='bar', title='Top Complaints Across Boroughs');
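To check the counting step without downloading the full file, here is a minimal sketch on a tiny made-up dataframe. The helper name top_complaints and the sample rows are assumptions for illustration only and are not part of the original answer; it also uses .head(n), which selects the same rows as .iloc[:n].

```python
import pandas as pd


def top_complaints(df, column="Complaint Type", n=10):
    """Return the n most frequent values in `column`, most common first.

    Hypothetical helper for illustration; not part of the original answer.
    """
    # value_counts() sorts by frequency in descending order by default,
    # so head(n) keeps the n most common values.
    return df[column].value_counts().head(n)


# Made-up sample rows standing in for the real 311 data.
sample = pd.DataFrame({
    "Complaint Type": [
        "Noise", "Heating", "Noise", "Illegal Parking", "Noise", "Heating",
    ],
})

top = top_complaints(sample, n=2)
print(top)
```

On the real dataframe, top_complaints(df).plot(kind='bar') should draw the same kind of chart as the one-line chain, and returning the counts first makes it easy to inspect the numbers before plotting.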

Written by Abish Pius