Visualizing Gun Violence Trends in the US

Shravya Simha
Published in VisUMD
7 min read · Dec 12, 2019

How data can offer insight into a complex societal issue.

“New Orleans shooting: Two people in critical condition after 11 shot in French Quarter.”

“At least 10 people killed as 52 shot in Chicago.”

“Gunmen on motorbikes kill 41 people.”

Such headlines have become so commonplace that we are programmed to not so much as flinch when we come across them. The effectiveness and safety of guns are a hot topic of debate in the United States. In this project, we set out to understand the nature of incidents in a geographical and temporal context. We are mainly interested in exploring the prevalence of gun violence with reference to gun ownership, and how factors like socioeconomic status play a role in gun crimes.

Every year, tens of thousands of deaths and injuries occur due to gun crimes in the United States. Gun violence is an urgent, complex, and multifaceted problem. Many people are frustrated with the role guns currently have in American society. Although a lot of research is being done to understand the root cause of the problem, so far there has been no significant progress in attempts to curb it. The rate of gun violence in the US substantially exceeds that of most other developed nations. This means there is a need for new gun laws and policies.

In this context, we hoped to learn the following:

  • Nature of incidents in a geographical and temporal context
  • Correlation of gun ownership and gun violence
  • Prevalence of gun violence with reference to factors like socioeconomic status

Let’s first look at some statistics:

Design Process

We consciously tried to follow basic design principles while creating the visualizations:

  • Balance: In our visualizations, we have tried to achieve asymmetrical balance — different parts of the design are unique but carry a similar visual weight. “Heavier” elements jump out, while the “lighter” ones recede.
  • Emphasis: In all our visualizations, we have emphasized important data by drawing the user’s attention to it using colors, size, negative space, or contrast. The goal of this design principle is to ensure that users see the most important data first.
  • Proportion: Proportions in our visualizations indicate the weight of different data sets and the relationship between their values. Important data is displayed relatively bigger in proportion to data that is significantly less important.
  • Variety: We have tried to achieve variety in visualizations by diverse use of color, shape, and chart-type. In our opinion, this is what keeps users interested and engaged with the data.

We made use of the large amount of data available to us by relating the parts of the dataset that could affect each other. We used colors, shapes, and textures to make the visualizations more readable, and we tried a variety of chart types: bar charts, scatter plots, a US map, a word cloud, a pie chart, and line graphs.

Most of our visualizations are self-explanatory; no prior knowledge of visualization guidelines is needed to understand the graphs.

We had initial visions about what we wanted to find out from our dataset. Once we got our dataset, we started playing around with the data to find patterns and trends in it. We started sketching basic outlines of our visualizations before we used a tool to do it. This gave us a place to start and also helped us understand how to proceed.

Dataset

We started with a gun violence dataset covering 2013 to 2018. This dataset was large: a total of 26 columns and 239,677 rows! Although we could derive useful insights from it alone, we needed more data to find deeper relationships and trends. With this in mind, we combined the gun violence data with gun ownership data and US Census data.
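
Combining the three sources presumably means joining them on a shared key. The sketch below assumes that key is the state name and that each source has already been reduced to a per-state dictionary. The function name `join_by_state`, the field names, and the sample values are all hypothetical placeholders, not taken from the project.

```python
def join_by_state(incident_counts, ownership_pct, population):
    """Combine three per-state mappings into one record per state.

    Only states present in all three inputs are kept, so a state missing
    from any source is dropped rather than filled with a guess.
    """
    shared = incident_counts.keys() & ownership_pct.keys() & population.keys()
    merged = {}
    for state in sorted(shared):
        merged[state] = {
            "incidents": incident_counts[state],
            "ownership_pct": ownership_pct[state],
            # Normalize by population so large and small states are comparable.
            "incidents_per_100k": incident_counts[state] * 100_000 / population[state],
        }
    return merged


# Placeholder inputs (not real figures).
example = join_by_state(
    {"State A": 50, "State B": 10},
    {"State A": 40.0},
    {"State A": 1_000_000, "State B": 500_000},
)
print(example)
```

Keeping only the states found in every source is one possible choice; it avoids comparing a state's incident count against a missing ownership or population value.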

Data Cleaning

It is very challenging to make sense of data when a single cell holds multiple values. So, we parsed the columns that held multiple values, such as participant age group, participant age, participant gender, and participant status. These values were stored with special characters like pipes and double colons as delimiters. We cleaned the data so that each value could be handled on its own, and then split the cleaned data into new, meaningful columns. We used Python scripts for the cleaning.
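
A parsing step like this can be sketched in a few lines of Python. The sketch assumes entries are separated by a double pipe (`||`) and that each entry is a participant index and a value joined by a double colon (`::`). The function name `parse_multi_value` and the sample values are illustrative, not the project's actual script.

```python
def parse_multi_value(cell):
    """Turn '0::Adult||1::Teen' into {0: 'Adult', 1: 'Teen'}."""
    parsed = {}
    if not cell:
        return parsed  # empty or missing cell: nothing to parse
    for entry in cell.split("||"):
        index, separator, value = entry.partition("::")
        # Skip entries that lack the '::' separator or a numeric index.
        if not separator or not index.strip().isdigit():
            continue
        parsed[int(index)] = value.strip()
    return parsed


print(parse_multi_value("0::Adult||1::Teen"))
```

Returning a dictionary keyed by participant index makes it possible to line up a participant's age, gender, and status across columns later.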

We then handled unknown values in each column and kept only the columns that had enough data to yield any insight. Some columns also needed a datatype change. For example, the number of guns involved was stored as a floating-point value. Since this count is always a whole number, we converted the column to integers.
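
As a rough illustration, the conversion and the "enough data" check could look like the following. The helper names `to_int_or_none` and `coverage` and the sample column are assumptions made for this sketch; the cutoff for how much coverage counts as "enough" is a judgment call that the article does not state.

```python
def to_int_or_none(raw):
    """Convert a whole-number float such as '2.0' to int; return None otherwise."""
    if raw is None:
        return None
    try:
        number = float(str(raw).strip())
    except ValueError:
        return None  # blank or non-numeric cells count as unknown
    # is_integer() is False for NaN, infinity, and fractional values.
    return int(number) if number.is_integer() else None


def coverage(values):
    """Fraction of entries that are known (not None)."""
    return sum(v is not None for v in values) / len(values) if values else 0.0


# Placeholder column contents, including a blank and a NaN.
raw_column = ["1.0", "2.0", "", "nan", "3.0"]
n_guns = [to_int_or_none(v) for v in raw_column]
print(n_guns, coverage(n_guns))
```

Mapping unknowns to `None` instead of 0 keeps "no data" distinct from "zero guns", which matters when averaging or counting later.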

Similarly, we split the date column into year, month, and weekday columns to find changes in the pattern across years, months, and weekdays.
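
A minimal sketch of this split, assuming dates are stored as `YYYY-MM-DD` strings (the article does not state the format). The function name `split_date` and the output keys are illustrative.

```python
from datetime import datetime

# Indexed by datetime.weekday(), where Monday is 0 and Sunday is 6.
WEEKDAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday")


def split_date(date_text, fmt="%Y-%m-%d"):
    """Break a date string into year, month, and weekday name."""
    parsed = datetime.strptime(date_text, fmt)
    return {
        "year": parsed.year,
        "month": parsed.month,
        "weekday": WEEKDAY_NAMES[parsed.weekday()],
    }


print(split_date("2017-07-04"))
```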

The gender column was in the 0::Male||1::Male||3::Male||4::Female format. We created two columns, participant gender male and participant gender female, and calculated the number of males and females involved in each incident. We handled other columns in a similar way.
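
The per-incident counts can be derived by tallying the value after each double colon. This is a sketch under the same delimiter assumptions as the format described above; `count_genders` and the output key names are hypothetical.

```python
from collections import Counter


def count_genders(cell):
    """Count male and female participants in one gender cell."""
    counts = Counter()
    for entry in (cell or "").split("||"):
        _, separator, gender = entry.partition("::")
        if separator:  # ignore entries without the '::' separator
            counts[gender.strip().lower()] += 1
    return {
        "participant_gender_male": counts["male"],
        "participant_gender_female": counts["female"],
    }


print(count_genders("0::Male||1::Male||3::Male||4::Female"))
```

A `Counter` returns 0 for labels it has not seen, so incidents with no recorded gender produce zero counts rather than an error.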

Main Findings

  1. Our dataset had data about gun crimes from 2013 to 2018. In this dataset, the maximum number of incidents took place in 2017.
  2. There is a direct correlation between gun ownership and gun violence. Higher gun ownership -> more gun violence

We are both from India. Although gun crimes are not unheard of there, in a country with a population of more than 1.3 billion we hear of gun crimes far less frequently than in the United States. Procuring a gun in India is certainly more difficult due to the stringent laws surrounding gun ownership and the hassle of obtaining a license. This is not the case in the US, where it is relatively easy to obtain a gun, especially in the name of self-defense. We wanted to understand whether the extremely high number of guns per capita contributes to the high number of gun crimes. Interestingly, we found that, although the relationship does not hold for every state, most states with high gun ownership have a higher number of gun crime incidents. (Example: Washington DC, Louisiana, Illinois)
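
One way to put a number on a visual impression like this is a Pearson correlation coefficient across states. The article's finding comes from reading the charts, so the sketch below is only a possible follow-up check. The function name `pearson` and the sample values are placeholders, not results from the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return covariance / sqrt(var_x * var_y)


# Placeholder per-state values (not real figures).
ownership_pct = [10.0, 25.0, 40.0, 55.0]
incidents_per_100k = [3.0, 4.5, 4.0, 7.0]
print(round(pearson(ownership_pct, incidents_per_100k), 3))
```

A value near +1 would support the "higher ownership, more violence" reading, while a value near 0 would suggest the pattern is weaker than it looks. Either way, correlation alone does not show that one causes the other.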

3. Lower income -> more gun violence

Similar to gun ownership, we can see from the visualizations that, although there is no 1:1 mapping between the states and the income per capita, most states with lower income per capita have a higher percentage of gun crimes.

4. We wanted to find out the top 3 cities with the highest number of gun crimes, for which we plotted the number of incidents against the cities and filtered the results on the number of incidents. We found that Chicago, Baltimore, and Washington are the top 3 cities with the highest recorded incidents.
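
The same ranking can be reproduced outside a visualization tool by counting incidents per city and taking the largest counts. This is a minimal sketch; `top_cities` is a hypothetical helper and the city labels are placeholders.

```python
from collections import Counter


def top_cities(incident_cities, n=3):
    """Return the n cities with the most incidents as (city, count) pairs."""
    counts = Counter(city for city in incident_cities if city)  # skip blanks
    return counts.most_common(n)


# One entry per incident (placeholder labels).
sample = ["City A", "City B", "City A", "City C", "City A",
          "City B", "City D", "City C", "City B", "City A"]
print(top_cities(sample))
```

Raw counts favor large cities, so dividing by population would be a reasonable variation if the goal is to compare rates.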

5. More gun crimes are recorded during the weekends.

We looked for a relationship between the day of the week and the number of incidents. It turns out that more gun crimes were reported during the weekends. One possible reason is that more people visit malls and grocery shops during the weekend.
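
When comparing weekends with weekdays, it helps to account for the fact that a week has two weekend days and five weekdays. The sketch below divides each group's total by its number of days. The article does not say whether its comparison used raw or per-day counts, so this is only one way to check the claim, and `daily_average_by_day_type` is a hypothetical helper.

```python
from datetime import date


def daily_average_by_day_type(incident_dates):
    """Average incident count per day of the week, for weekends and weekdays."""
    # date.weekday() returns 5 for Saturday and 6 for Sunday.
    weekend_total = sum(1 for d in incident_dates if d.weekday() >= 5)
    weekday_total = len(incident_dates) - weekend_total
    return {
        "weekend_per_day": weekend_total / 2,
        "weekday_per_day": weekday_total / 5,
    }


# Placeholder dates: a Saturday, a Sunday, and a Monday.
dates = [date(2019, 12, 7), date(2019, 12, 8), date(2019, 12, 9)]
print(daily_average_by_day_type(dates))
```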

6. More men than women are being killed by gun crimes

This is an interesting finding. Men make up a higher percentage of the victims of gun violence than women.

7. Target age group is mostly adults

It might come as no shock that more adults are being targeted in these incidents.

8. More incidents in July, January and March

Another interesting finding was that the number of incidents in July is significantly higher than in any other month. Our guess is that the 4th of July may have something to do with this.

You can view the demo of our visualization here:
