Death, Violence, and Disaster

Police Violence Thinking Man by Engin Akyurt (pixabay.com)

Nobody knows who’s killing whom. Even the police don’t know. Everyone is vulnerable and no part of the community is immune.
Faed Bakr

In a world saturated by violence, a disturbing trend of police-targeted and police-perpetrated violence has been on the rise. Or so the media would have us believe. But does this conclusion stand up to the data?

I very recently watched Whose Streets? (2017), a documentary film exploring the Ferguson Uprising as told by the people who lived through it. The film tackles questions of police violence and its effects on a community and its people, as well as the ways in which these people fought back against their oppression at the hands of law enforcement.

While the movement itself preaches a message of nonviolence, protests can, and at times do, devolve into riots, violence, and looting. This potential for violence has led me to question the underlying influences and motivators for police violence.

Although I have no intention of rationalizing or defending police violence, I think it is important to recognize the potential factors behind its apparent rise. While race and socioeconomic factors are the most widely recognized influences on police violence, the issue is far more complex, and I am interested in looking at the issue through a different lens.

In an effort to explore this issue, I’ve drawn upon my background as an Information Scientist to leverage data to explore potential contributing factors to police violence.

The Process

I began my analysis by searching for data sets that I could easily work with. I searched various online repositories like Kaggle, Data.gov, and GitHub until I found three data sets that I liked: FiveThirtyEight’s Police Deaths Dataset, FiveThirtyEight’s Police Killings Dataset, and FEMA’s Disaster Declarations Summary Dataset. I chose these three data sets because together they can support a more nuanced analysis of the issue of police violence.

There are few reports linking police violence and police deaths, and no reports linking national natural disasters to police violence. Using this new approach, I formed hypotheses for my analysis:

  1. There is a positive correlation between police violence and police officer deaths.
  2. There is a negative correlation between police violence and national natural disasters.

Once I had found my data sets, I went through the process of cleaning my data in order to make my analysis easier and my visualizations less tedious to make.

The process I underwent to clean my data involved loading my data into Jupyter Notebook and manipulating it with the Python scripting language, primarily through the use of two Python libraries: pandas and datetime.

Note: dataframe below refers to an instance of DataFrame, the tabular data class in the pandas library.

Above is the code I used to import the pandas and datetime libraries into my notebook.

After importing my libraries into my notebook, I used pandas.read_csv() and pandas.read_excel() to read in my data sets as pandas data frames.

Above is the code I used to import my 1st data frame, which is in the form of a .csv file.
Above is the code I used to import my 2nd data frame, which is in the form of a .csv file.
Above is the code I used to import my 3rd data frame, which is in the form of a .xlsx file.
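
A minimal sketch of this loading step might look like the following. The file contents and column names are hypothetical stand-ins (an in-memory CSV string is used so the example runs on its own), not the actual data sets.

```python
import io

import pandas as pd

# Hypothetical stand-in for one of the .csv files; the real column
# names and values in the data sets may differ.
sample_csv = io.StringIO(
    "person,dept,date\n"
    "Officer A,Example PD,2015-01-02\n"
    "Officer B,Sample PD,2015-03-04\n"
)

# read_csv() accepts a file path or any file-like object.
police_deaths = pd.read_csv(sample_csv)

# An .xlsx file would be read the same way with pd.read_excel(path),
# which needs an Excel engine such as openpyxl installed.
print(police_deaths.shape)
```

With real files, the in-memory string would simply be replaced by the path to the downloaded file.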

Next I started the lengthy process of cleaning my data sets.

Cleaning Data Set #1
The main issues with my 1st data set were that the data set’s columns had an unhelpful naming convention, the data set contained columns that I had no use for, and the data set’s date values were formatted incorrectly.

To rectify the first two issues, I assigned a new list of names to dataframe.columns (an attribute, not a method) to rename the columns, and I used dataframe.drop() to remove the columns which I had no use for.
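
As a rough illustration of renaming and dropping columns, here is a small sketch. The raw and cleaned column names, other than “Date_of_Death”, are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical raw column names; the real data set's columns may differ.
df = pd.DataFrame(
    {
        "person": ["Officer A", "Officer B"],
        "dept": ["Example PD", "Sample PD"],
        "eow": ["1/2/2015", "3/14/2015"],
        "canine": [False, False],
    }
)

# columns is an attribute, so renaming every column means assigning a
# new list of names with the same length as the old one.
df.columns = ["Name", "Department", "Date_of_Death", "Canine"]

# drop() returns a new data frame without the listed columns.
df = df.drop(columns=["Canine"])

print(list(df.columns))
```

If only a few columns need new names, dataframe.rename(columns={...}) is an alternative that leaves the other names untouched.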

Next, to solve the third issue, I used dataframe.iterrows() to iterate through the data set, focusing on the “Date_of_Death” column. Whilst iterating through each row in the data set, I used datetime.strptime() to convert the date values (which are strings) to datetime values, followed by datetime.strftime() to format those datetime values and turn them back into strings. I was able to do this in place, for each row, by setting the cell’s value with dataframe.at[].
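
The row-by-row conversion described above can be sketched as follows. Both the input format ("%m/%d/%Y") and the output format ("%Y-%m-%d") are assumptions chosen for the example; the formats in the real data set may be different.

```python
from datetime import datetime

import pandas as pd

# Hypothetical date strings; the source and target formats below are
# assumptions for illustration only.
df = pd.DataFrame({"Date_of_Death": ["1/2/2015", "3/14/2015"]})

for index, row in df.iterrows():
    # Parse the string into a datetime value.
    parsed = datetime.strptime(row["Date_of_Death"], "%m/%d/%Y")
    # .at[] writes a single cell, addressed by row label and column name.
    df.at[index, "Date_of_Death"] = parsed.strftime("%Y-%m-%d")

print(df["Date_of_Death"].tolist())
```

For larger data sets, a vectorized alternative such as pd.to_datetime(df["Date_of_Death"], format="%m/%d/%Y").dt.strftime("%Y-%m-%d") avoids the explicit loop, at the cost of being a little less step-by-step.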

Cleaning Data Set #2
The main issues with my 2nd data set were that the data set’s columns had an unhelpful naming convention, the data set contained columns that I had no use for, and the data set’s date values were split across three columns.

To rectify the first two issues, I assigned a new list of names to dataframe.columns to rename the columns, and I used dataframe.drop() to remove the columns which I had no use for.

Next, to solve the third issue, I first created a list that I used to create a new column that combined the data set’s month, day, and year values into single date values.

I then used dataframe.iterrows() to iterate through the data set. Whilst iterating through each row in the data set, I used string concatenation to combine the month, day, and year values into a single date value, which I added to the list using list.append().

After populating the list, I used dataframe.insert() to add it to the data set as a new column titled “Date_of_Death”.
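
A small sketch of building the combined column, assuming hypothetical column names ("Month", "Day", "Year") and made-up rows:

```python
import pandas as pd

# Hypothetical column names and values for the three date parts.
df = pd.DataFrame(
    {
        "Name": ["Person A", "Person B"],
        "Month": ["February", "April"],
        "Day": [3, 21],
        "Year": [2015, 2015],
    }
)

dates = []
for _, row in df.iterrows():
    # Concatenate the parts into one string such as "February 3, 2015".
    dates.append(row["Month"] + " " + str(row["Day"]) + ", " + str(row["Year"]))

# insert(position, column_name, values) adds the list as a new column
# at the given position (here, right after "Name").
df.insert(1, "Date_of_Death", dates)

print(df["Date_of_Death"].tolist())
```

The list must have exactly one entry per row, which the loop guarantees because it appends once for each row it visits.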

Next, I once more used dataframe.iterrows() to iterate through the data set, this time focusing on the “Date_of_Death” column. Whilst iterating through each row in the data set, I used datetime.strptime() to convert the date values (which are strings) to datetime values, followed by datetime.strftime() to format those datetime values and turn them back into strings. I was able to do this in place, for each row, by setting the cell’s value with dataframe.at[].

Now that I had created my “Date_of_Death” column, I no longer needed my month and day values, so I removed them from my data set using dataframe.drop().

Cleaning Data Set #3
The main issues with my final data set were that the data set’s columns had an unhelpful naming convention, the data set’s date values were formatted incorrectly, and the data set was missing year value columns.

To rectify the first issue, I assigned a new list of names to dataframe.columns to rename the columns.

Next, to solve the last two issues, I first created two lists that I used to create two new columns: one with properly formatted date values and another with only year values.

I then used dataframe.iterrows() to iterate through the data set. Whilst iterating through each row in the data set, I used datetime.strptime() to convert the date values (which I made into strings) to datetime values, followed by datetime.strftime(), which I used twice: once to format those datetime values and turn them back into strings, and once to extract year values from the dates.

I used these date and year values to populate the two lists with list.append().
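
The two-list approach can be sketched with the standard library alone. The timestamp strings and their input format here are assumptions for illustration, not values from the FEMA data set.

```python
from datetime import datetime

# Hypothetical timestamp strings; the input format is an assumption.
raw_values = ["2017-08-25T00:00:00", "2018-11-12T00:00:00"]

formatted_dates = []
years = []
for value in raw_values:
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")
    # First strftime() call: the full date as a formatted string.
    formatted_dates.append(parsed.strftime("%Y-%m-%d"))
    # Second strftime() call: only the year, also as a string.
    years.append(parsed.strftime("%Y"))

print(formatted_dates)
print(years)
```

Note that strftime("%Y") yields the year as a string; parsed.year would give it as an integer instead, which may be more convenient for sorting or filtering.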

I repeated these steps three times.
The first time for “Decleration_Datetime”.

The second time for “Disaster_Start_Date”.

And finally, the third time for “Disaster_End_Date.”

After populating each of the lists, I used dataframe.insert() to add them to the data set as new columns, titled “Deceleration_Date”, “Decleration_Year”, “Disaster_Start_Date”, “Disaster_Start_Year”, “Disaster_End_Date”, and “Disaster_End_Year”.

Exporting the Cleaned Data Sets
After cleaning my data sets, I used dataframe.to_excel() to export the data sets as Microsoft Excel files (.xlsx).

Above is the code I used to export my 1st cleaned data frame as a .xlsx file.
Above is the code I used to export my 2nd cleaned data frame as a .xlsx file.
Above is the code I used to export my 3rd cleaned data frame as a .xlsx file.

After exporting my cleaned data sets, I used Tableau to create visualizations from them.

The Results

Hypothesis: There is a positive correlation between police violence and police officer deaths.

The number of police officer deaths in 2015 across various states.
The number of deaths caused by police officers in 2015 across various states.
Police Deaths by State (2015) [Blue] + Police Killings by State (2015) [Red]

Takeaways:

  1. Far more people were killed by police in 2015 than police officers died in the line of duty that year. This is to be expected considering police training and law enforcement’s access to equipment.
  2. The five states with the highest number of police officer deaths in 2015 are: Texas (12), New York (11), Georgia (8), Louisiana (8), and California (6).
  3. The five states with the highest number of people killed by police in 2015 are: California (74), Texas (43), Florida (29), Arizona (25), and Oklahoma (22).
  4. There appears to be no correlation between the number of police deaths and the number of people killed by police.
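
One way to put a number on takeaway 4, rather than judging it by eye from the charts, would be to compute Pearson's correlation coefficient over the per-state counts. The sketch below is not part of the original analysis; the function name is hypothetical and the inputs are toy values, not figures from either data set.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a list is constant")
    return covariance / (spread_x * spread_y)


# Toy values only, chosen to show the two extremes; they are not taken
# from either data set.
rising = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
falling = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
print(rising, falling)
```

A result near 1 or -1 indicates a strong linear relationship, while a result near 0 is consistent with no linear correlation. To apply it here, the two lists would hold each state's police officer deaths and people killed by police, in the same state order.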

Whilst there may be no correlation, perhaps we can establish a bit of causation.

The number of police officer deaths from 2014 to 2016 across various states.

Takeaway:
While three years of data is not sufficient to establish causation, we can see that police officer deaths are on the decline, which could indicate that police violence deters violence against the police.

Interpretation:
My hypothesis was incorrect. There seems to be no correlation (positive or negative) between police violence and police officer deaths. The number of people killed by police seems to be independent of the number of police deaths (at least in 2015).

Hypothesis: There is a negative correlation between police violence and national natural disasters.

Police officer deaths from 2014 to 2016 across various states.
National natural disasters (1953–2019) vs. police officer deaths (2015) across various states.
National natural disasters (1953–2019) vs. deaths caused by police officers (2015) across various states.

Takeaways:

  1. The number of police officer deaths is on the decline from 2014 to 2016.
  2. There seems to be a weak correlation between national natural disasters and police deaths/police killings insofar as states with high numbers of national natural disasters tend to have higher counts of police officer deaths and people killed by police in 2015.

Takeaways:

  1. States like Texas, Florida, and New York have higher concentrations of police officer deaths (2015), people killed by police (2015), and national natural disasters (1953–2019).
  2. Fire seems to have the closest correlation to police deaths and police killings as states like Texas, New York, Florida, and Colorado all have high concentrations of fire-related national natural disasters (1953–2019) and relatively large amounts of police officer deaths (2015) and people killed by police (2015).
  3. California is one of the states that call my hypothesis into question. It has a large concentration of police officer deaths (2015) and people killed by police (2015), yet a surprisingly low concentration of national natural disasters (1953–2019).

Interpretation:
My hypothesis was incorrect. There seems to be little to no correlation (positive or negative) between police violence and national natural disasters. Any connection between national natural disasters and the number of people killed by police or the number of police deaths seems coincidental at best.

In Conclusion

My initial findings suggest that police officer deaths and national natural disasters have little to no bearing on police violence. While this may be the case, it could also be the result of limitations in the scope of my data sets. For instance, FiveThirtyEight’s Police Killings Dataset only contained data points from 2015, which restricted my ability to evaluate trends in police killings over time. Perhaps with a larger amount of data a stronger analysis can be formed, but for the time being we should continue to keep a critical, if skeptical, eye on the ways in which police officer deaths and national natural disasters could affect police violence.

The virtues of science are skepticism and independence of thought.
Walter Gilbert

Bonus Visualizations: A Breakdown of Police Deaths and Killings by Population

The cause of police officer deaths from 2014 to 2016 across various states.
A breakdown of people killed by law enforcement: Cause of Death
A breakdown of people killed by law enforcement: Gender
A breakdown of people killed by law enforcement: Race/Ethnicity
A breakdown of people killed by law enforcement: Were they armed?
