Visualization of Air Pollution (Using Folium)
Air pollution has become a major problem in this age of globalisation. Many countries are pushing hard to pump growth into their economies without a sustainable approach. Seoul, South Korea, is one example. The Seoul Metropolitan Government has collected measurements of several air pollutants: NO2 (nitrogen dioxide), SO2 (sulphur dioxide), CO (carbon monoxide), O3 (ozone), and PM2.5 and PM10 (particulate matter).
The aim of this article is to introduce a few visualisation tools and to draw inferences from what they show. Let’s start by exploring the dataset that the Seoul Metropolitan Government published on Kaggle as a public dataset. To download the dataset, click here.
As usual, we start by reading the CSV file with pandas.
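As a rough sketch of this step, the snippet below reads a few made-up rows from an in-memory string, so it runs without the download. The column names are assumptions modelled on the Kaggle file and may not match it exactly. With the real file, you would pass its path to pd.read_csv instead of the string buffer.

```python
import io

import pandas as pd

# Made-up sample rows standing in for the real CSV file.
# The column names are assumptions, not taken from the article.
sample_csv = io.StringIO(
    "Measurement date,Station code,Latitude,Longitude,SO2,NO2,O3,CO,PM10,PM2.5\n"
    "2017-01-01 00:00,101,37.572016,127.005007,0.004,0.059,0.002,1.2,73.0,57.0\n"
    "2017-01-01 01:00,101,37.572016,127.005007,0.004,0.058,0.002,1.2,71.0,59.0\n"
    "2017-01-01 02:00,101,37.572016,127.005007,-1.0,0.056,0.002,1.2,70.0,59.0\n"
)

# parse_dates converts the timestamp column to datetime values on load.
df = pd.read_csv(sample_csv, parse_dates=["Measurement date"])

print(df.head())
print(df.dtypes)
```

Parsing the dates while loading saves a separate conversion step later, when we derive time-based columns.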
During the exploration we found that a few values are -1, which could be caused by a faulty apparatus taking those pollutant readings. We therefore decided to impute those values with the mean of each column. This is easy to do with scikit-learn's “SimpleImputer”.
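The imputation could look something like the sketch below. The tiny DataFrame and its column names are made up for illustration. Telling SimpleImputer that -1 is the missing-value marker means the column means are computed from the valid readings only.

```python
import pandas as pd
from sklearn.impute import SimpleImputer

# Made-up readings; -1 marks a faulty measurement (assumed column names).
df = pd.DataFrame({
    "SO2": [0.004, -1.0, 0.006, 0.005],
    "PM2.5": [20.0, 30.0, -1.0, 40.0],
})

pollutants = ["SO2", "PM2.5"]

# Treat -1 as "missing" and replace it with the mean of the remaining values.
imputer = SimpleImputer(missing_values=-1, strategy="mean")
df[pollutants] = imputer.fit_transform(df[pollutants])

print(df)
```

Note that this sketch assumes the columns contain no NaN values. If both -1 and NaN can occur, one option is to replace -1 with NaN first and impute on NaN.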
We plotted line graphs for the different emissions, with the date on the x-axis and the level of each pollutant (NO2, SO2, etc.) on the y-axis. Seaborn is a great library for plotting these graphs.
We also plotted the correlation matrix to check which gases are related to one another. We observed that SO2, NO2 and O3 are highly correlated. This suggests that a rise or fall in one of these gases is accompanied by a change in the others.
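Here is a small sketch of how the correlation matrix can be computed and drawn. The values are made up, so the numbers it produces say nothing about the real dataset. The seaborn heatmap is an assumption about the plot type used.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; remove this line in a notebook

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up pollutant readings (assumed column names).
df = pd.DataFrame({
    "SO2": [0.004, 0.005, 0.006, 0.005, 0.007],
    "NO2": [0.030, 0.034, 0.041, 0.036, 0.045],
    "O3": [0.020, 0.018, 0.012, 0.016, 0.010],
    "CO": [0.5, 0.6, 0.7, 0.5, 0.8],
})

# Pairwise Pearson correlation between the pollutant columns.
corr = df.corr()
print(corr.round(2))

# Draw the matrix; fixing the colour range to [-1, 1] keeps zero in the middle.
fig, ax = plt.subplots(figsize=(5, 4))
sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1, ax=ax)
```

Values close to 1 or -1 indicate a strong linear relationship, and the sign shows whether the two gases move in the same or in opposite directions.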
When analysing time series data that includes latitude and longitude, it is always good to view the data on a map. A great library for this is “Folium”. It is highly intuitive to use and it is open source. To start with, the dataset needs some pre-processing.
We have to create a few columns for hour, day, week, month and year, which will let us select readings by time period when we plot the latitude and longitude on the map.
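One way to derive these columns with pandas is sketched below. The sample rows, the source column name "Measurement date" and the new column names are assumptions for illustration.

```python
import pandas as pd

# Made-up rows; "Measurement date" is an assumed column name.
df = pd.DataFrame({
    "Measurement date": ["2017-03-15 09:00", "2018-07-01 18:00", "2019-12-31 23:00"],
    "PM2.5": [57.0, 21.0, 35.0],
})

# Convert the text timestamps to datetime values first.
df["Measurement date"] = pd.to_datetime(df["Measurement date"])

# Split the timestamp into its calendar parts.
stamp = df["Measurement date"].dt
df["hour"] = stamp.hour
df["day"] = stamp.day
df["week"] = stamp.isocalendar().week.astype(int)  # ISO week number
df["month"] = stamp.month
df["year"] = stamp.year

print(df)
```

With these columns in place, filtering by a single year or month becomes a simple comparison such as df[df["year"] == 2017].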
Next we create a function, generateBaseMap, with some default parameters:
- default_location: The latitude and longitude on which the map is centred.
- control_scale: Whether a scale bar is shown on the map.
- zoom_start: The initial zoom level when the map loads.
Now we need to create a list of latitude, longitude and PM2.5 values (any other pollutant would work here) for a given year. We then add this information to the map created by the generateBaseMap function, which plots the latitude and longitude on the map. Below is the code for reference:
Here we can infer that pollution has been increasing continuously since 2017. The shift in colour towards red also shows that “PM2.5” values are moving towards the “very bad” severity level.
We also plotted the data month by month for 2017 to see how pollution changes over the course of a year.
Again, it follows the same pattern, with a continuous increase from month to month.
Conclusion
The correlation matrix showed a strong relationship between SO2, NO2 and O3. It would be fun to explore the relationship among these gases further. Further reading (source) also indicates that gases released from factories are converted into ozone (O3) in the presence of sunlight. In the line plot above, O3 is consistently higher in summer than in winter, which is consistent with sunlight driving the increase in O3.
Hope you liked the blog. Please do clap, share and comment! Stay tuned for my next blog.