Saudi Arabia Traffic Accidents Data Before Women Started To Drive

Sarah Aljudaibi
Mar 21, 2020


You might wonder, as I did, what the data looked like before women started to drive in 2018. To answer this question, I took four datasets from the previous two years, 2016 and 2017, and applied data science techniques to them.

As a newbie in data science I still have so much to learn, but since I started the course it has changed how I think about the things around me. How do people see trends in data? Where does the data come from? If I use this method, what will the results be?

So for my first project I will take a look at the traffic accident and driving license numbers in Saudi Arabia, searching for trends and for the factors that influence the outcome of traffic accidents.

The datasets I used:

  1. Driving Licenses dataset (1993–2017):

This dataset contains all the licenses issued in this range of years in 12 regions of Saudi Arabia.

  2. Traffic Accidents dataset (2016–2017):

This dataset has the number of accidents and their type for each of the 12 regions of Saudi Arabia.

  3. Accidents Details 2016:

This dataset comes from a study of some regions (not all 12) that collected accident data to see how the accidents happened and what caused them.

  4. Accidents Details 2017:

This dataset is the same as the previous one, for 2017. It also covers only some regions (not all 12) and records how the accidents happened and what caused them.

Let’s dive into the data

After a first look at the data, I found that all the DataFrames needed to be cleaned, so the first step is:

Traffic Accidents dataset (2016–2017)

Data cleaning

  1. Get rid of null values, if any exist.
  2. Change each column in the DataFrame to its appropriate type; for example, a column that contains numbers should be an integer or float type.
  3. Drop unnecessary columns.
  4. Rename the columns to more memorable names.
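
The four steps above can be sketched in pandas roughly as follows. This is a minimal illustration and not the code used for this project: the raw column names (`Unnamed: 0`, `Region`, `Year`, `Value`), the string-typed numbers, the empty row, and the helper name `clean_accidents` are all assumptions. The three accident totals are the Makkah, Eastern Region and Riyadh figures quoted later in this article, which I assume here belong to 2016.

```python
import pandas as pd

# Toy stand-in for the raw table. Column names, string values and the
# empty last row are assumptions made for illustration only.
raw = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2, 3],
    "Region": ["Makkah", "Eastern Region", "Riyadh", None],
    "Year": ["2016", "2016", "2016", None],
    "Value": ["137081", "102732", "143166", None],
})


def clean_accidents(df):
    """Apply the four cleaning steps to a raw accidents table."""
    cleaned = (
        df.dropna()                                   # 1. remove rows with nulls
          .drop(columns=["Unnamed: 0"])               # 3. drop unnecessary columns
          .rename(columns={"Region": "region",        # 4. friendlier column names
                           "Year": "year",
                           "Value": "accidents"})
    )
    # 2. convert numeric columns from strings to integers
    cleaned = cleaned.assign(
        year=pd.to_numeric(cleaned["year"]),
        accidents=pd.to_numeric(cleaned["accidents"]),
    )
    return cleaned.reset_index(drop=True)


accidents = clean_accidents(raw)
print(accidents.dtypes)
print(accidents)
```

The order of the steps is flexible; here the type conversion runs last so that it works on the renamed columns.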

This is what the data looks like after some cleaning:

Traffic Accidents & Driving Licenses after merging
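
A merge like this one can be sketched as below. Again, this is an illustration under assumptions and not the project's actual code: the column names are hypothetical, and the figures are the regional totals quoted in the scatter plot section of this article, with the first set assumed to be 2016 and the second 2017.

```python
import pandas as pd

regions = ["Makkah", "Eastern Region", "Riyadh"]

# Totals quoted in this article; the year labels are an assumption.
accidents = pd.DataFrame({
    "region": regions * 2,
    "year": [2016] * 3 + [2017] * 3,
    "accidents": [137081, 102732, 143166, 147182, 82396, 101324],
})
licenses = pd.DataFrame({
    "region": regions * 2,
    "year": [2016] * 3 + [2017] * 3,
    "licenses": [136055, 142307, 242851, 142487, 126816, 495307],
})

# Join the two tables on region and year; validate guards against
# duplicated keys silently multiplying rows.
merged = accidents.merge(
    licenses, on=["region", "year"], how="inner", validate="one_to_one"
)
print(merged)
```

An inner join keeps only the region and year pairs that appear in both tables, which matters here because the licenses data reaches back to 1993 while the accidents data covers only 2016 and 2017.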

Investigate trends in the data

The first thing that came to my mind was which regions have the highest and lowest values for licenses issued and for accidents. Not surprisingly, Riyadh had the most licenses issued over the years from 1993 to 2017, while Tabouk had the fewest over the same period.

Licenses issued for the past five years

In traffic accidents, Riyadh had the highest number, but its count decreased in 2017, while Makkah's count increased until it became the region with the most traffic accidents. The region with the fewest accidents was Albaha.

Traffic Accidents (2016–2017)
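
One way to find the top region per year and the year-over-year change is sketched below. It is only an illustration: it uses just the three regional totals quoted in the scatter plot section of this article (so Albaha and the other regions are missing), the column names are hypothetical, and I assume the first set of totals is 2016 and the second 2017.

```python
import pandas as pd

# Accident totals quoted in this article; year labels are an assumption.
accidents = pd.DataFrame({
    "region": ["Makkah", "Eastern Region", "Riyadh"] * 2,
    "year": [2016] * 3 + [2017] * 3,
    "accidents": [137081, 102732, 143166, 147182, 82396, 101324],
})

# One row per region, one column per year.
by_year = accidents.pivot(index="region", columns="year", values="accidents")

# Region with the most accidents in each year.
top_region = by_year.idxmax()

# Change from 2016 to 2017 for each region (negative means a decrease).
change = by_year[2017] - by_year[2016]

print(top_region)
print(change)
```

With these numbers the top region switches from Riyadh to Makkah, and Makkah is the only one of the three whose count goes up. The same pattern with `idxmin()` would give the region with the fewest accidents when all regions are included.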

From these trends, the regions with more traffic accidents and more driving licenses issued each year than the other regions are:

  1. Riyadh Region
  2. Makkah Region
  3. Eastern Region

Visualize the data

From the scatter plot for 2016:

  • Makkah had 137081 accidents in total and 136055 licenses issued, which means accidents exceeded licenses by about 0.75%
  • The Eastern Region had 102732 accidents in total and 142307 licenses issued
  • Riyadh had 143166 accidents in total and 242851 licenses issued

From the scatter plot for 2017:

  • Makkah had 147182 accidents in total and 142487 licenses issued, which means accidents exceeded licenses by about 3%
  • The Eastern Region had 82396 accidents in total and 126816 licenses issued
  • Riyadh had 101324 accidents in total and 495307 licenses issued
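
The percentages for Makkah can be checked with a few lines of plain Python. This is a small sketch, assuming the gap is measured relative to the number of licenses and that the first set of totals is 2016 and the second 2017; the function name `percent_gap` is my own.

```python
def percent_gap(accidents, licenses):
    """Return how far accidents exceed licenses, as a percentage of licenses."""
    return (accidents - licenses) / licenses * 100


# (accidents, licenses) totals quoted above; year labels are an assumption.
totals = {
    ("Makkah", 2016): (137081, 136055),
    ("Makkah", 2017): (147182, 142487),
    ("Eastern Region", 2016): (102732, 142307),
    ("Eastern Region", 2017): (82396, 126816),
    ("Riyadh", 2016): (143166, 242851),
    ("Riyadh", 2017): (101324, 495307),
}

gaps = {key: percent_gap(acc, lic) for key, (acc, lic) in totals.items()}

for (region, year), gap in gaps.items():
    print(f"{region} {year}: {gap:+.2f}%")
```

A positive value means more accidents than licenses issued. Only Makkah comes out positive in both years; the Eastern Region and Riyadh are negative in both.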

From this I conclude that the Makkah region doesn't have a safe traffic system.

The next datasets, Accidents Details 2016 and Accidents Details 2017, help show which types of accidents happen most and what the reasons are. Both datasets classify accidents into those with only vehicles damaged, those with injured people, and those with deaths.

In 2016 and 2017, most accidents involved only damaged vehicles, which seemed like good news until I looked at the accidents that caused deaths. These have high numbers in the months of Rabie the second, Ramadan and Shawal, likely because in these months most people spend their time outside or visit their families to celebrate, especially in Ramadan and Shawal because of Eid, so the streets are always crowded with cars.

Most accidents happened in Rabie the second, Ramadan 2016
Most accidents happened in Rabie the second, Ramadan and Shawal 2016

To see which types of accidents happened most in the two years, I plotted them as a line plot. The image below shows that crashing into other cars was the number one accident type, far ahead of the other types.

Both 2016 and 2017 have crash-type accidents

To learn more about what causes these types of accidents, I plotted another column from the DataFrame. I found that the most common cause was “other reason”, but since that does not tell me much about the accidents, I looked at the second most common cause, which was speeding.

Both 2016 and 2017 have “other reason” and “speed exceeding” as the causes of accidents

After finishing these steps I know more about the history of accidents in these two years. Apart from “other reason”, the most common cause is speeding, which can lead to crashing into other cars on the road; if you are lucky, nobody gets hurt and the damage is only to the cars.

Most of these accidents happen because drivers are careless and act irresponsibly, so I investigated the DataFrame further to see the ages of the drivers involved in most accidents. The bar chart below shows that most accidents happened when the drivers were between 18 and 39, and in 2017 there were accidents involving drivers younger than 18, which is very dangerous. Younger people tend to love adventure and the rush of adrenaline in exciting situations, without caring about the consequences.

Conclusion

Both the Traffic Accidents and Driving Licenses datasets show interesting trends. To find out more about the causes of accidents in these regions and how they connect to the licenses-issued data, I used two additional DataFrames containing the types and causes of accidents that happened in 2016 and 2017 in general.

Using data visualization, I found that in 2016 most accidents happened in the months of Rabie the second, Ramadan and Shawal, and that most of them were accidents with only vehicles damaged, followed by accidents that caused deaths.

The most common accident type in 2016 was car crashes caused by speeding. The highest numbers of speed-related accidents were in Shawal, Ramadan and Rabie the second. It was not surprising to see that in 2017 the most common accident type was also car crashes caused by speeding, with Ramadan as the highest month.

The final step was to find the ages of the drivers who caused those accidents. In 2016 the drivers ranged from 18 to 29 only, but in 2017 drivers became more reckless and accidents were caused by a wider variety of drivers: the age range was from 30 to 49, and people younger than 18 also started to cause more accidents.

Now that women can drive in Saudi Arabia, it will be useful to use the new data from 2018 to 2020 in the future to investigate further and compare driving skills between the sexes.
