Global Terrorism data analysis using pandas and plotly for interactive graphs

Carmem Stefanie
Nov 27, 2020


Analysing a global terrorism database that contains data from 1970 to 2017, almost five decades of attacks around the world, to build a picture of how this violence occurs globally. The project was made with pandas, and the interactive graphs were generated with plotly. After the analysis, a Dash app was built.

Developed in partnership with João Vitor Dias Xavier.

Introduction

Terrorism, as defined by Oxford University, is “the unlawful use of violence and intimidation, especially against civilians, in the pursuit of political aims.” It’s a non-specific and subjective definition, which means it encompasses many cases around the world. For that reason, databases were created to index cases of terrorism and classify them.

Our objective is to present the material obtained from one of these databases in several different ways, using different graphs to help understand and visualize the data.

Development

The dataset we used is called the Global Terrorism Database, and it’s available on Kaggle (link in the References). The data initially had 135 columns, covering terrorism data from 1970 to 2017. In the pre-processing stage we filtered those down to 16 columns, keeping only the ones we would use later. With these 16 columns as a base, we raised some questions and plotted graphs to suggest a possible answer to each one.
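As a rough illustration of the pre-processing step, here is a minimal sketch of keeping only a subset of columns with pandas. The column names below follow the naming used in the Kaggle file, but the exact 16 columns we kept are not listed here, so this particular selection (and the friendlier names it is renamed to) is an assumption. A tiny in-memory table stands in for the real file.

```python
import pandas as pd

# Toy stand-in for the raw Kaggle file, which has 135 columns.
raw = pd.DataFrame({
    "eventid": [1, 2, 3],
    "iyear": [1970, 1985, 2014],
    "country_txt": ["Peru", "Colombia", "Iraq"],
    "region_txt": ["South America", "South America", "Middle East & North Africa"],
    "latitude": [-12.05, 4.71, 33.31],
    "longitude": [-77.04, -74.07, 44.36],
    "attacktype1_txt": ["Bombing/Explosion", "Armed Assault", "Bombing/Explosion"],
    "nkill": [0.0, 2.0, 5.0],
    "unused_column": ["a", "b", "c"],  # stands for the columns that get dropped
})

# Assumed mapping: original column name -> friendlier name (illustrative subset).
COLUMNS = {
    "iyear": "year",
    "country_txt": "country",
    "region_txt": "region",
    "latitude": "latitude",
    "longitude": "longitude",
    "attacktype1_txt": "attack_type",
    "nkill": "killed",
}

# Keep only the selected columns, then rename them.
df = raw[list(COLUMNS)].rename(columns=COLUMNS)
print(df.head())
```

With the real CSV, the same selection could be done at load time by passing `usecols=list(COLUMNS)` to `pd.read_csv`, which avoids loading the unused columns at all.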

First of all, we wanted to know how many terrorist attacks happened around the world over the years. To answer this, we made an interactive, animated bubble plot placed on a world map. The size of each bubble indicates the number of attacks that happened in that region.

So now we know how many cases happened each year, but how many cases occurred in total for each country? This question led us to build the following graph, still a bubble plot but in an orthographic view and with only one bubble per country. As in the previous one, the size of the bubble indicates the number of attacks in that place.

We have already seen enough information about the number of cases, so the next step was to check the types of attacks. As mentioned in the introduction, the definition of terrorism is not limited to a single type of violence. So we wanted to see how many cases of each type of attack occurred, with these values separated by region of the world.
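Counting attack types per region is a cross-tabulation, which pandas provides directly. The sketch below uses a handful of made-up rows and assumed column names; the resulting table is the kind of input a grouped bar chart would be drawn from.

```python
import pandas as pd

# Toy data; the column names and rows are assumptions for the example.
df = pd.DataFrame({
    "region": [
        "South America", "South America", "South America",
        "Middle East & North Africa", "Middle East & North Africa",
        "South Asia",
    ],
    "attack_type": [
        "Bombing/Explosion", "Bombing/Explosion", "Armed Assault",
        "Bombing/Explosion", "Assassination",
        "Armed Assault",
    ],
})

# Rows are regions, columns are attack types, cells are the number of attacks.
table = pd.crosstab(df["region"], df["attack_type"])
print(table)

# Most common attack type in each region.
most_common = table.idxmax(axis=1)
print(most_common)
```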

Staying on the same subject, our objective now was to see how these attacks were distributed over time. Thus, we plotted violin graphs, one per region, where the y axis represents the passage of years.

To close this analysis, the last question was roughly this: knowing the distribution of attacks per year tells us nothing about the type of attack that happened at each moment, so how can we know whether some type of attack was becoming more common? Is there a pattern? To answer this, both a scatter plot and a histogram were built; in both, the color represents the type of attack. This view helps us see whether, for some country or region, one type of terrorism is more common (for example, in South America between 1977 and 1997 there were twenty cases of Hostage Taking Barricade Incident).

Note: The visualization of this graph was affected by Medium's formatting. For a better understanding, view it in its original form by clicking on “See full report”.

The code produced can be accessed on GitHub: https://github.com/carmems/DataScience-GlobalTerrorism

Results

Each graph gives us a different perspective on terrorism around the world. For example, in the second-to-last one, we can see that the majority of terrorist attacks in Central America & Caribbean and South America happened in the same period: from 1980 to the early 90s, after which reported cases dropped drastically. We can hypothesize that cases in those regions were concentrated in this period because of the authoritarian governments across Latin America, which killed many people, persecuted many more, took hostages, and so on.

In addition, it is also possible to see that other regions had a “peaceful” 20th century but have had a bad time from 2010 onward (the data ends in 2017): South Asia, Middle East & North Africa, Sub-Saharan Africa, Eastern Europe and Southeast Asia.

Another thing we can do is combine two or more graphs to reach a better-grounded conclusion. For instance, the third plot tells us that the most common attack type in Middle East & North Africa is Bombing/Explosion. Returning to the second plot, we can see this same region in a color between blue and green. There, it is possible to find which country has the highest number of attacks: Iraq. With these two pieces of information, we can suppose that this is somehow related to the Iraq War, a conflict that lasted from 2003 to 2011.

Finally, moving away from the graphs, it is important to say that this project also included building a Dash app. For this, we used Plotly version 4.12 and Ngrok to host the website (since the code was written in Google Colaboratory, and Dash runs on localhost). As mentioned previously, the code is accessible on GitHub and the Dash app is ready to be opened.

References

GitHub: https://github.com/carmems/DataScience-GlobalTerrorism

Data: https://www.kaggle.com/START-UMD/gtd

Global Terrorism Information: https://ourworldindata.org/terrorism

Iraq War Information: https://www.britannica.com/event/Iraq-War

Contributions

Conceptualization, Carmem and João; Development, Carmem; Introduction, João; Methodology, Carmem and João; Results, Carmem and João.
