Visualizing Forced Displacement Data from Somalia
What climate anomalies result in forced displacement and what are the needs of people affected by it?
Understanding the data through visualizations, and using it to identify which communities in Somalia most need support.
Two months of collaboration
I have been collaborating with an international team of data scientists and machine learning engineers in Omdena’s AI challenge to determine if climate anomalies (drought, rainfall, etc.) are related to conflicts or displacements.
We worked closely with the UN Refugee Agency (UNHCR) to collect data, complete data analysis, and create predictive models. The region of interest for the challenge was Somalia.
One of the tasks I worked on was doing exploratory data analysis on forced displacement data. I took advantage of a data visualization toolkit to create some visuals and gain more insights into the data.
The Data
The dataset for this specific task was collected and curated by UNHCR and contained useful information about displacements that occurred in Somalia from January 2016 to July 2019.
The dataset included information about how many people were affected by a certain displacement, the reason behind this displacement, the departure and arrival region, and what the priority need was on arrival for the displaced.
Questions to answer and creating visualizations
The data visualization toolkit I used is Plotly, which is well suited to creating interactive visualizations and sharing them with other people. I planned to use these visuals to answer some questions I had about the data.
Question 1: What are the reasons for displacements and what is the most common one?
The plot for this question showed four possible reasons for displacement: drought, conflict/insecurity, flood, and other. The most common reason was drought. This is an important data point, since the challenge was to discover the relationship between climate anomalies and displacements.
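A count like the one behind Question 1 can be sketched in a few lines. This is only a minimal illustration, not the notebook used in the challenge: the rows are toy data rather than UNHCR figures, and the field name "reason" and both function names are assumptions made for the example. The chart function imports Plotly only when it is called, so the counting part runs without it.

```python
from collections import Counter

# Toy rows for illustration only; these are NOT UNHCR figures.
# The field name "reason" is an assumed column name.
events = [
    {"reason": "Drought"},
    {"reason": "Conflict/Insecurity"},
    {"reason": "Drought"},
    {"reason": "Flood"},
    {"reason": "Other"},
    {"reason": "Drought"},
]


def count_by_reason(rows):
    """Return (reason, event_count) pairs, most frequent first."""
    return Counter(row["reason"] for row in rows).most_common()


def reason_bar_chart(pairs):
    """Build an interactive Plotly bar chart from (reason, count) pairs."""
    import plotly.graph_objects as go  # lazy import; requires plotly

    reasons = [reason for reason, _ in pairs]
    counts = [count for _, count in pairs]
    fig = go.Figure(go.Bar(x=reasons, y=counts))
    fig.update_layout(
        title="Displacement events by reason (toy data)",
        xaxis_title="Reason",
        yaxis_title="Number of events",
    )
    return fig


reason_counts = count_by_reason(events)
print(reason_counts)
```

With Plotly installed, reason_bar_chart(reason_counts).show() opens the interactive chart, and the same pattern works for counting events per region by swapping the field that is counted.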
Question 2: How many regions are there and how many displacements occur in each region?
The visualization for this question showed all the regions in Somalia and the number of recorded displacements for each one. The numbers were very high for Bari, Bay, and Banadir.
When a displacement occurred, the displaced had a priority need on their arrival in a new region. For example, one of the entries in the dataset is a displacement event that occurred on January 31st, 2016, in which 23 people were displaced to a new region where their first need on arrival was water. In the dataset, this is called the priority need, and every displacement event has one associated with it. The priority need column was interesting to me because it could be used to see not only the overall distribution of priority needs, but also how priority needs differ by region.
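To make the two uses of the priority need column concrete, here is a small sketch that finds the most common need per region and the number of individuals behind each need. The rows are toy data, not UNHCR figures; the keys "arrival_region", "priority_need" and "individuals" are assumed column names, and "Shelter" is a hypothetical need added only for the example.

```python
from collections import Counter, defaultdict

# Toy rows for illustration only; these are NOT UNHCR figures.
# All keys are assumed column names; "Shelter" is a hypothetical need.
events = [
    {"arrival_region": "Bay", "priority_need": "Food", "individuals": 120},
    {"arrival_region": "Bay", "priority_need": "Water", "individuals": 23},
    {"arrival_region": "Bay", "priority_need": "Food", "individuals": 40},
    {"arrival_region": "Bari", "priority_need": "Shelter", "individuals": 75},
    {"arrival_region": "Banadir", "priority_need": "Food", "individuals": 310},
    {"arrival_region": "Banadir", "priority_need": "Shelter", "individuals": 15},
    {"arrival_region": "Banadir", "priority_need": "Shelter", "individuals": 60},
]


def top_need_per_region(rows):
    """Map each arrival region to its most frequently recorded need."""
    needs_by_region = defaultdict(Counter)
    for row in rows:
        needs_by_region[row["arrival_region"]][row["priority_need"]] += 1
    return {
        region: needs.most_common(1)[0][0]
        for region, needs in needs_by_region.items()
    }


def individuals_per_need(rows):
    """Return (need, total_individuals) pairs, largest total first."""
    totals = Counter()
    for row in rows:
        totals[row["priority_need"]] += row["individuals"]
    return totals.most_common()


top_needs = top_need_per_region(events)
need_totals = individuals_per_need(events)
print(top_needs)
print(need_totals)
```

Counting events and summing individuals can give different answers: in the toy rows, the need recorded most often in Banadir is not the one that covers the most people there, which is why both views are worth plotting.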
Further questions
How many different priority needs are there?
For each specific region, what is the most common priority need?
Which priority need affects most individuals?
Which priority needs are most common for each month?
The plot for this last question was less useful than the others, because food shows up as a frequent priority need in every month. However, it is still helpful because it shows that March and July each have a high number of displacement events.
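A monthly breakdown of this kind might be computed as sketched below. Again, this is an assumption-laden illustration rather than the original analysis: the rows are toy data, "date" and "priority_need" are assumed column names, and the dates are assumed to be ISO strings. The sketch counts events per (month, need) pair and also totals events per month, since the totals are what make busy months stand out.

```python
from collections import Counter
from datetime import date

# Toy rows for illustration only; these are NOT UNHCR figures.
# "date" and "priority_need" are assumed column names.
events = [
    {"date": "2016-01-31", "priority_need": "Water"},
    {"date": "2016-03-05", "priority_need": "Food"},
    {"date": "2017-03-18", "priority_need": "Food"},
    {"date": "2017-03-20", "priority_need": "Shelter"},
    {"date": "2018-07-02", "priority_need": "Food"},
    {"date": "2018-07-09", "priority_need": "Water"},
]


def needs_by_month(rows):
    """Count events per (calendar month, priority need), pooling all years."""
    counts = Counter()
    for row in rows:
        month = date.fromisoformat(row["date"]).month
        counts[(month, row["priority_need"])] += 1
    return counts


def events_per_month(counts):
    """Collapse (month, need) counts into total events per month."""
    totals = Counter()
    for (month, _need), n in counts.items():
        totals[month] += n
    return totals


def monthly_needs_chart(counts):
    """Build a stacked Plotly bar chart with one trace per priority need."""
    import plotly.graph_objects as go  # lazy import; requires plotly

    months = list(range(1, 13))
    needs = sorted({need for _month, need in counts})
    fig = go.Figure()
    for need in needs:
        # A Counter returns 0 for (month, need) pairs that never occurred.
        fig.add_trace(
            go.Bar(name=need, x=months, y=[counts[(m, need)] for m in months])
        )
    fig.update_layout(
        barmode="stack",
        title="Priority needs by month (toy data)",
        xaxis_title="Month",
        yaxis_title="Number of events",
    )
    return fig


monthly = needs_by_month(events)
month_totals = events_per_month(monthly)
print(month_totals)
```

Stacking the bars keeps the per-need split visible while the bar heights show the monthly totals, so a dominant need does not hide the seasonal pattern.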
Conclusion and what I learned
Visualizing data is a crucial part of any data science project, and for this challenge it was an important first step toward understanding some of the data we had before building predictive models. It also gave us a chance to see the scope of the problem in Somalia, especially how many people were displaced between January 2016 and July 2019.
Thanks to the collaborative nature of this challenge, other group members did their own analyses and visualizations on the same dataset and we were able to pool our findings together and learn new things from each other’s work. Some interesting explorations performed by other team members included looking at displacement flow between regions and districts, displacements on a seasonal basis, comparing reasons for displacement events, and more.
This was a large project with many possible data sources, including satellite imagery, tabular data, and vegetation data, to name a few. Being part of such a large and talented team and contributing to the work done by UNHCR was a rewarding experience.
I hope others see the impact that a collaborative community like this can have and will consider joining upcoming challenges.