ANALYSIS ON EMERGENCY — 911 CALLS WITH PYTHON

Jonah Usanga
Published in CodeX
7 min read · Oct 29, 2022

ABOUT PROJECT
As we all know, an emergency is an unforeseen or sudden occurrence that demands immediate action. I have often wondered why particular emergency calls are made, and that is the main reason I decided to analyse the 911 calls dataset.

GETTING DATA
For this project I will be analysing 911 (emergency call) data from Kaggle for Montgomery County, Pennsylvania. The data contains the following fields:

• lat : String variable, Latitude

• lng: String variable, Longitude

• desc: String variable, Description of the Emergency Call

• zip: String variable, Zipcode

• title: String variable, Title

• timeStamp: String variable, YYYY-MM-DD HH:MM:SS

• twp: String variable, Township

• addr: String variable, Address

• e: String variable, Dummy variable (always 1)

PYTHON LIBRARIES IMPORTATIONS
A library is a collection of functions that we include in our Python code and call as necessary. With libraries, pre-existing functions can be imported, which saves us from writing that functionality ourselves. For this project, I will import the following libraries: pandas, numpy, matplotlib, seaborn, sklearn, etc. I then set %matplotlib inline since I’m using a Jupyter notebook.
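
A minimal sketch of what the import cell could look like. The aliases (np, pd, plt, sns) are the usual conventions rather than something fixed by the project, the styling line is my own optional assumption, and sklearn is left out here because none of the steps described in this article call it.

```python
# Core libraries used throughout the analysis.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# In a Jupyter notebook, the following magic command (not plain Python)
# makes figures render directly below each cell:
# %matplotlib inline

# Optional styling choice (an assumption, not required by the analysis).
sns.set_style("whitegrid")
```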

PROCEDURES
Several procedures were carried out; I will discuss them one after the other.

1. Reading in the 911 calls dataset as a DataFrame called “df”
There are several ways to read in files. In this project I used the pandas library, which can read files with several delimiters. I then used the info() method, which returns basic information about the DataFrame.

2. Checking the head of the dataset
The head() method belongs to the pandas library. It returns the top n rows of the DataFrame (5 by default) and is useful for quickly checking that the object has the right type of data in it.
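
The two steps above can be sketched roughly as follows. To keep the example runnable on its own, it reads three made-up rows from an in-memory string; the values are hypothetical placeholders, not records from the Kaggle file. With the real download, the call would be something like pd.read_csv("911.csv"), where the file name is an assumption.

```python
import io
import pandas as pd

# Hypothetical sample rows with the same columns as the dataset.
sample_csv = io.StringIO(
    "lat,lng,desc,zip,title,timeStamp,twp,addr,e\n"
    "40.30,-75.58,SAMPLE DESCRIPTION A,11111,EMS: BACK PAINS/INJURY,"
    "2015-12-10 17:40:00,TOWNSHIP A,SAMPLE ST,1\n"
    "40.26,-75.26,SAMPLE DESCRIPTION B,22222,Fire: SAMPLE EVENT,"
    "2015-12-10 17:45:00,TOWNSHIP B,SAMPLE AVE,1\n"
    "40.12,-75.35,SAMPLE DESCRIPTION C,11111,Traffic: SAMPLE EVENT,"
    "2015-12-11 08:05:00,TOWNSHIP A,SAMPLE RD,1\n"
)

# Read the data in as a DataFrame called df.
df = pd.read_csv(sample_csv)

# Column names, non-null counts and dtypes.
df.info()

# Top 2 rows; head() with no argument would return the top 5.
print(df.head(2))
```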

ANSWERING SOME BASIC QUESTIONS

1. Top 5 zipcode for 911 calls

2. Top 5 Township (Twp) for 911 calls
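
One way to answer both questions is value_counts() followed by head(5). This is a sketch on a tiny hypothetical DataFrame (the zip codes and township names are placeholders), not the actual results from the dataset.

```python
import pandas as pd

# Hypothetical stand-in for the zip and twp columns of df.
df = pd.DataFrame({
    "zip": [11111, 11111, 22222, 11111, 22222, 33333],
    "twp": ["TOWNSHIP A", "TOWNSHIP A", "TOWNSHIP B",
            "TOWNSHIP A", "TOWNSHIP B", "TOWNSHIP C"],
})

# value_counts() sorts by frequency, so head(5) gives the top 5.
top_zips = df["zip"].value_counts().head(5)
top_twps = df["twp"].value_counts().head(5)

print(top_zips)
print(top_twps)
```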

CREATING NEW FEATURES
In the title column, “Reasons/Departments” are specified before the title code. These are EMS, Fire, and Traffic. I’ll use .apply() with a custom lambda expression to create a new column called “Reason” that contains this string value.
For example, if the title column value is EMS: BACK PAINS/INJURY , the Reason column value would be EMS.
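
A minimal sketch of that lambda, assuming every title has the form "Reason: detail". Apart from the EMS example above, the title values below are made up for illustration.

```python
import pandas as pd

# Hypothetical title values; only the first one comes from the text above.
df = pd.DataFrame({
    "title": [
        "EMS: BACK PAINS/INJURY",
        "Fire: SAMPLE EVENT",
        "Traffic: SAMPLE EVENT",
    ]
})

# Keep everything before the first colon as the Reason.
df["Reason"] = df["title"].apply(lambda title: title.split(":")[0])

print(df)
```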

3. What is the most common Reason for a 911 call based off of this new column?

4. Now use seaborn to create a countplot of 911 calls by Reason.
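
Steps 3 and 4 could look roughly like this. The Reason values are a small hypothetical sample, so the counts are not the real ones. In a Jupyter notebook with %matplotlib inline the figure shows up under the cell.

```python
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the Reason column created earlier.
df = pd.DataFrame({
    "Reason": ["EMS", "EMS", "EMS", "Traffic", "Traffic", "Fire"]
})

# Step 3: count the calls per Reason and pick the most frequent one.
reason_counts = df["Reason"].value_counts()
most_common = reason_counts.idxmax()
print(reason_counts)

# Step 4: one bar per Reason, bar height = number of calls.
ax = sns.countplot(x="Reason", data=df)
```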

5. I am going to look at the time at which the 911 calls took place, but first let us change the data type of the timeStamp column from string to datetime.
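
A sketch of the conversion using pd.to_datetime, shown on two hypothetical timestamps in the YYYY-MM-DD HH:MM:SS format described earlier.

```python
import pandas as pd

# Hypothetical timestamps stored as strings.
df = pd.DataFrame({
    "timeStamp": ["2015-12-10 17:40:00", "2016-01-03 08:15:30"]
})

# Convert the whole column from strings to datetime values.
df["timeStamp"] = pd.to_datetime(df["timeStamp"])

# Each entry is now a Timestamp with attributes such as .hour.
print(df["timeStamp"].iloc[0].hour)
```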

6. Now that the timeStamp column values are actually DateTime objects, use .apply() to create 3 new columns called Hour, Month, and Day of Week. You will create these columns based off of the timeStamp column.

Notice how the Day of Week is an integer 0–6. Use the .map() with this dictionary to map the actual string names to the day of the week:
dmap = {0:'Mon',1:'Tue',2:'Wed',3:'Thu',4:'Fri',5:'Sat',6:'Sun'}
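
Put together, step 6 and the mapping might be written like this. The two timestamps are hypothetical; the column names and dmap follow the text above.

```python
import pandas as pd

# Hypothetical timestamps, already converted to datetime.
df = pd.DataFrame({
    "timeStamp": pd.to_datetime(["2015-12-10 17:40:00", "2016-01-03 08:15:30"])
})

# Pull the pieces out of each Timestamp with .apply().
df["Hour"] = df["timeStamp"].apply(lambda ts: ts.hour)
df["Month"] = df["timeStamp"].apply(lambda ts: ts.month)
df["Day of Week"] = df["timeStamp"].apply(lambda ts: ts.dayofweek)  # 0 = Monday

# Replace the integers 0-6 with short day names.
dmap = {0: 'Mon', 1: 'Tue', 2: 'Wed', 3: 'Thu', 4: 'Fri', 5: 'Sat', 6: 'Sun'}
df["Day of Week"] = df["Day of Week"].map(dmap)

print(df)
```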

7. I used seaborn to create a countplot of the Day of Week column with the hue based off of the Reason column.

8. I used seaborn to create a countplot of the Month column with the hue based off of the Reason column.
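
Both countplots follow the same pattern, so here is a sketch of the Day of Week version only; swapping x="Day of Week" for x="Month" gives the second plot. The rows are hypothetical.

```python
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the Day of Week and Reason columns.
df = pd.DataFrame({
    "Day of Week": ["Mon", "Mon", "Tue", "Tue", "Tue", "Sun"],
    "Reason": ["EMS", "Traffic", "EMS", "Fire", "Traffic", "EMS"],
})

# hue splits every day's bar into one bar per Reason.
ax = sns.countplot(x="Day of Week", hue="Reason", data=df)
```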

9. Did you notice something strange about the Plot?
You should have noticed that some months are missing. Let’s see if we can fill in this information by plotting it another way, possibly with a simple line plot that bridges the missing months. To do this, we’ll need to do some work with pandas.
Now create a groupby object called byMonth, where you group the DataFrame by the Month column and use the count() method for aggregation. Use the head() method on this returned DataFrame.
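
A sketch of the byMonth aggregation on a few hypothetical rows. Note that count() counts non-null values per column, so each column of the result holds the number of calls recorded for that month.

```python
import pandas as pd

# Hypothetical rows; only the Month column matters for the grouping.
df = pd.DataFrame({
    "Month": [1, 1, 2, 12],
    "twp": ["TOWNSHIP A", "TOWNSHIP B", "TOWNSHIP A", "TOWNSHIP C"],
    "e": [1, 1, 1, 1],
})

# Group by Month and count the rows in each group.
byMonth = df.groupby("Month").count()

print(byMonth.head())
```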

10. Now create a simple plot from the DataFrame showing the count of calls per month. Any column will do.

11. I used seaborn’s lmplot() to create a linear fit on the number of calls per month. Keep in mind you may need to reset the index to a column.
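
Steps 10 and 11 could be sketched as below. The byMonth values are invented for illustration and are not the dataset's counts. The reset_index() call is what turns Month from the index back into a regular column so that lmplot can use it as x.

```python
import pandas as pd
import seaborn as sns

# Hypothetical byMonth result with some months missing, as in the article.
byMonth = pd.DataFrame(
    {"twp": [130, 115, 110, 113, 114, 117, 120, 90, 80]},
    index=pd.Index([1, 2, 3, 4, 5, 6, 7, 8, 12], name="Month"),
)

# Step 10: a simple line plot of calls per month.
line_ax = byMonth["twp"].plot()

# Step 11: linear fit; Month must be a column, not the index.
byMonth_reset = byMonth.reset_index()
fit_grid = sns.lmplot(x="Month", y="twp", data=byMonth_reset)
```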

12. Create a new column called ‘Date’ that contains the date from the timeStamp column. You’ll need to use apply along with the .date() method.

Now groupby this Date column with the count() aggregate and create a plot of counts of 911 calls.
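
A sketch of step 12 and the grouped plot, using four hypothetical calls spread over two days.

```python
import pandas as pd

# Hypothetical timestamps, already converted to datetime.
df = pd.DataFrame({
    "timeStamp": pd.to_datetime([
        "2015-12-10 17:40:00",
        "2015-12-10 18:05:00",
        "2015-12-10 23:10:00",
        "2015-12-11 08:15:30",
    ]),
    "twp": ["TOWNSHIP A", "TOWNSHIP B", "TOWNSHIP A", "TOWNSHIP C"],
})

# Drop the time of day and keep only the calendar date.
df["Date"] = df["timeStamp"].apply(lambda ts: ts.date())

# Number of calls per date, then a line plot of those counts.
calls_per_day = df.groupby("Date").count()["twp"]
ax = calls_per_day.plot()
```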

13. Now I will recreate the above as 3 separate plots, each representing one Reason for the 911 call.
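
One possible way to do this is to filter on Reason before grouping and draw each result on its own axes. The data is hypothetical, and stacking the three plots in a single figure is my own layout choice; three separate figures would work just as well.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical calls: three per Reason, spread over two dates.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2016-01-01", "2016-01-01", "2016-01-02",
        "2016-01-01", "2016-01-02", "2016-01-02",
        "2016-01-01", "2016-01-02", "2016-01-02",
    ]).date,
    "Reason": ["Traffic"] * 3 + ["Fire"] * 3 + ["EMS"] * 3,
    "twp": ["TOWNSHIP A"] * 9,
})

reasons = ["Traffic", "Fire", "EMS"]
fig, axes = plt.subplots(3, 1, figsize=(8, 9))

for ax, reason in zip(axes, reasons):
    # Keep only the calls for this Reason, then count them per date.
    per_day = df[df["Reason"] == reason].groupby("Date").count()["twp"]
    per_day.plot(ax=ax)
    ax.set_title(reason)

fig.tight_layout()
```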

Now let’s move on to creating heatmaps with seaborn and our data. We’ll first need to restructure the dataframe so that the columns become the Hours and the Index becomes the Day of the Week. There are lots of ways to do this, but I would recommend trying to combine groupby with an unstack method.

14. I then created a HeatMap using this new DataFrame.
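
The restructuring and the heatmap can be sketched like this on a handful of hypothetical rows. The variable name dayHour, the fill_value=0 argument (which shows day/hour pairs with no calls as 0 instead of NaN) and the viridis colour map are my own choices for the example.

```python
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the Day of Week, Hour and Reason columns.
df = pd.DataFrame({
    "Day of Week": ["Mon", "Mon", "Mon", "Tue", "Tue"],
    "Hour": [8, 8, 17, 8, 17],
    "Reason": ["EMS", "Fire", "EMS", "Traffic", "EMS"],
})

# Count calls per (day, hour) pair, then move Hour into the columns.
dayHour = (
    df.groupby(["Day of Week", "Hour"])
    .count()["Reason"]
    .unstack(fill_value=0)
)
print(dayHour)

# Rows are days, columns are hours, colour encodes the call count.
ax = sns.heatmap(dayHour, cmap="viridis")
```

Grouping by ["Day of Week", "Month"] instead gives the table for the Month version.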

15. I also repeated the above plots and operations for a DataFrame that shows the Month as the column.

CONCLUSION

From the analysis so far, I have been able to discover some insights, which include:
• Lower Merion is the township with the most 911 calls, at about 8,443.

• EMS is the most common reason for 911 calls, followed by Traffic.

• Across the days of the week, Sunday has the fewest 911 calls for Traffic, while Tuesday has the most. On every day of the week, the most common reason for 911 calls is EMS and the least common is Fire.

• For Traffic, January has the most 911 calls while December has the fewest.

• For Fire, July has the most 911 calls while December has the fewest.

• For EMS, January has the most 911 calls while December has the fewest.

• In the data analysed, December has the fewest 911 calls for every reason. This suggests that fewer 911 calls come in, for any reason, at the end of the year.

Thanks for going through my project. Your observations and suggestions are highly appreciated. You can leave your input in the comment section, email me directly [here], or reach me through my other social media platforms below.

LinkedIn

Twitter

Github

Dataset

Full code

Jonah Usanga
An experienced data analyst/scientist, expertise in solving problems related to machine learning, interpreting and analyzing data to drive successful business.