Visualizing the state of Idamalayar and Idukki reservoirs during Kerala floods of 2018.
This post describes the concept, process, insights and extensions of an exploratory data visualization project done as part of the Interactive Data Viz course at IDC. The assigned domain of exploration was ‘Kerala floods’.
You can view the interactive viz on my website.
Beginning in July 2018 and lasting till mid-August, severe floods affected the south Indian state of Kerala due to unusually high rainfall during the monsoon season. It was the worst flooding in Kerala in nearly a century.
The exploration
To begin with, I tried to fathom the situation during the floods from various perspectives — to name a few: rainfall, relief camps, the causes of the floods, and the effects of quarrying and the depletion of wetlands. While exploring, I read through various online articles and blogs, watched videos from a few news channels, and came across various reports posted by the government. I was making personal notes in parallel, and in this process I ended up with a summarized view of the happenings over June, July and August 2018.
The data
To get better insight, I then tried to figure out if I could find suitable in-depth data for the key events I had logged from various resources. I searched the internet and eventually moved on to e-newspapers (Indian Express, Thiruvananthapuram edition).
A few other sources of data for me were:
Kerala State Electricity Board (KSEB) — they publish system statistics for 16 reservoirs in Kerala on a daily basis.
India Meteorological Department — they have data on daily rainfall, but in an awkward format that was difficult to extract; I eventually dropped it due to lack of time.
After taking snippets from the newspapers, I documented all the data and tried classifying it into various categories. While doing this I realized that the most extensive data I had was related to the dams, though even that was not comprehensive enough to cover all of them. Still, this activity gave me a fair idea of the happenings during those two months.
On further curation of the KSEB data, I found it was good enough to tell me the inflow and outflow of individual dams on every single day (though a few data points were missing). This helped me cross-verify against the dates of shutter openings of various dams, which I had collected from the newspapers.
The concept
I started off with some data-defining questions:
1. What interlinks/correlations do I see in the data I have?
2. What could be a possible narrative to form?
3. Can I create a mashup with some other set of data to find some interesting insights?
4. What was the richest data I possessed, and what could be the focus?
After curating and collecting sufficient data, I chose to try to draw close links between the various political and environmental conditions which led to the worsening of the flood.
The key interpretations and ideas which I wanted to visualize were as follows:
1. The cause of the spilling of multiple dams (excessive rains owing to two low-pressure systems formed over the northern Bay of Bengal on 7th and 14th August 2018).
2. The opposition party’s accusation over the delay in opening the Idukki dam, and the non-adherence to the test-run schedule.
3. Whether the water release from the Mullaperiyar dam, managed by the Tamil Nadu govt., aggravated the situation in Kerala.
The final curation of data
After copy-pasting all the required data from the KSEB website, it was time to parse it into the components that were of interest to me.
I used MS Excel, and the VLOOKUP function came in very handy for cleaning up the data. I could now also spot the missing data points quite easily and tried to fill them in from other data sources.
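The same VLOOKUP-style join can be sketched in JavaScript: match rows of two date-keyed tables and flag the dates where the second table has no entry. The field names and values below are hypothetical, not from my actual sheets.

```javascript
// Join two date-keyed datasets the way VLOOKUP does in Excel:
// for each date in the primary table, look up the matching row in the
// secondary table, and flag dates where the value is missing.
function joinByDate(primary, secondary) {
  const lookup = new Map(secondary.map(row => [row.date, row]));
  return primary.map(row => {
    const match = lookup.get(row.date);
    return {
      ...row,
      outflow: match ? match.outflow : null, // null marks a missing data point
      missing: !match,
    };
  });
}

const inflows = [
  { date: '2018-08-09', inflow: 12.4 },
  { date: '2018-08-10', inflow: 18.7 },
];
const outflows = [{ date: '2018-08-09', outflow: 10.1 }];

const merged = joinByDate(inflows, outflows);
// The 2018-08-10 row comes back with outflow: null and missing: true,
// which makes gaps in the data easy to spot at a glance.
console.log(merged);
```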
The inflow of water was what interested me, but it was not directly available as a volume of water; it was published in terms of the reservoir’s generation capacity in MU. KSEB listed the generating capacity and the number of machines of each reservoir, which could in principle be used to trace back the volume of water required to match the generation figures. This seemed tricky, with quite a high chance of missing variables that might affect the results. On closer study of the data, I found that the inflow of water (in mcm, million cubic metres) could be extrapolated from the stagnant water capacity (in MU), the effective storage and the gross generation capability, because they bear a linear relation. Using some quick tricks in Excel, I managed to get the inflow in mcm.
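The extrapolation amounts to fitting a single conversion factor from days where both the MU and mcm figures are known, then applying it to the inflow column. A minimal sketch, with illustrative numbers rather than actual KSEB figures:

```javascript
// Sketch of the MU -> mcm extrapolation. Since storage in MU and storage
// in mcm bear a linear relation for a given reservoir, a conversion
// factor can be estimated from days where both figures are known, then
// applied to inflow values published only in MU.
function fitMcmPerMu(knownDays) {
  // Simple estimate: average the mcm/MU ratio over the known days.
  const ratios = knownDays.map(d => d.storageMcm / d.storageMu);
  return ratios.reduce((a, b) => a + b, 0) / ratios.length;
}

function inflowToMcm(inflowMu, mcmPerMu) {
  return inflowMu * mcmPerMu;
}

// Toy data: both days give a ratio of exactly 1.4 mcm per MU.
const known = [
  { storageMu: 500, storageMcm: 700 },
  { storageMu: 520, storageMcm: 728 },
];
const factor = fitMcmPerMu(known);
console.log(inflowToMcm(10, factor)); // 10 MU of inflow -> 14 mcm
```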
The Visualization
After deciding upon the focus, it was time to come up with ideas to visualize it. First I thought of creating some sort of animation to show the quantity of water flowing and the links between the various dams, inspired by Manorama Online. To understand this better, I studied the connections between the dams and found that Idukki and Idamalayar were closely related; Idukki being the most prominent dam in the entire disaster, I chose to focus solely on these two. Idukki is also linked with the Mullaperiyar dam, which was one of the prime focuses I had narrowed down upon.
Initial ideas
Final prototyping
After weighing the positives and negatives of the various ideas, I felt that estimating the volume of water was of the most importance; also considering the principle of effectiveness, a bar graph would possibly suit the purpose best!
So, having decided upon bar graphs, I looked into various tools which could come in handy for early mock-ups. After fixing my data in Excel, I turned to Flourish.
Moving on, I felt Flourish would not give me enough liberty with interactivity, so I picked up Chart.js, a simple and flexible JavaScript charting library, to build my webpage.
Some challenges during prototyping with chart.js were:
1. Chart.js does not directly support .CSV files, so I had to figure out a workaround.
Below are the steps to pull the data from a .CSV:
Create a Chart. Build the initial scaffolding with chart.js.
Build a Data Feed. Use the Flex.io API to access the remote CSV file.
Convert and Format the Data. Use Flex.io to convert the file to JSON and Lodash to format it as required by Chart.js.
Pull it all together. Combine the front-end component with the data feed to display the live bar chart.
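The same pipeline can be sketched without Flex.io, using a plain fetch and a hand-rolled CSV parse. The file URL and column layout (date, inflow, outflow) below are hypothetical placeholders, not my actual files:

```javascript
// Minimal CSV -> Chart.js pipeline: parse the CSV text by hand and
// reshape it into the { labels, datasets } structure Chart.js expects.
// Assumes the first column holds the x-axis labels (dates) and every
// further column is one numeric series.
function csvToChartData(csvText) {
  const [header, ...rows] = csvText.trim().split('\n').map(l => l.split(','));
  return {
    labels: rows.map(r => r[0]),
    datasets: header.slice(1).map((name, i) => ({
      label: name,
      data: rows.map(r => Number(r[i + 1])),
    })),
  };
}

// In the browser, the result feeds straight into Chart.js:
// fetch('data/idukki.csv')                       // hypothetical path
//   .then(res => res.text())
//   .then(csv => new Chart(document.getElementById('chart'), {
//     type: 'bar',
//     data: csvToChartData(csv),
//   }));

const sample = 'date,inflow,outflow\n2018-08-09,12.4,10.1\n2018-08-10,18.7,15.0';
console.log(csvToChartData(sample).datasets[0].data); // [ 12.4, 18.7 ]
```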
Reference code can be found on my Git or you may simply follow this tutorial.
2. The tooltips are a bit rigid; I still need to figure out a way to manipulate them.
3. Figuring out a way to plot multiple graphs with separate scales.
4. On interacting with the graphs (toggling various variables) the scale changes; I still need to figure out how to keep it constant.
Thanks for stopping by!
Constructive criticism is appreciated. :)
Here are the important links:
Link to the viz —
https://www.parthkapadia.in/KeralaDataViz/
Link to my curated data and paper clippings-
https://github.com/kapadiaparth/KeralaDataViz/tree/gh-pages