Covid-19: Genetic Mutation or Engineering

Motion Chart Visualisation in Python

Maziar Izadi
Analytics Vidhya
Apr 18, 2020

The virus will leave an economic impact for decades.

Using long-term interest rate data and other sources to examine the economic impact of 12 pandemics in history, researchers found “significant macroeconomic after-effects” that lasted for 40 years.

In this article, I let the data speak for itself.

A motion chart that tracks the outbreak in different countries from three perspectives has a lot to say:

  1. Number of confirmed cases
  2. Number of recovered cases
  3. Number of deaths

Data Source

I forked the data from the repository operated by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE), which is also supported by the ESRI Living Atlas Team and the Johns Hopkins University Applied Physics Lab (JHU APL).

COVID-19 outbreak in countries with more than 10k deceased as of Apr 6 2020

The complete Python code for this work is available on my GitHub, so I keep the walkthrough here more general.

The motion chart above is built on five dimensions:

  1. x-axis: number of confirmed cases
  2. y-axis: number of recovered cases
  3. Circle colour: country
  4. Circle size: number of deaths
  5. Time

Let’s get data in motion

Since the data is forked from the JHU CSSE COVID-19 repository into my GitHub, I use the os library to look into the GitHub folder and print its contents.
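As a minimal sketch of this step, the snippet below lists a folder with os.listdir. The data_dir path is an assumption about where the fork is cloned locally, and list_folder is a helper name I made up for illustration, so adjust both to your own setup.

```python
import os


def list_folder(path):
    """Return the names of the entries in `path`, sorted alphabetically."""
    return sorted(os.listdir(path))


# Hypothetical location of the local clone; change it to match your machine.
data_dir = os.path.join("COVID-19", "csse_covid_19_data", "csse_covid_19_time_series")

if os.path.isdir(data_dir):
    # Print one entry per line, as in the folder listing.
    for name in list_folder(data_dir):
        print(name)
else:
    print(f"Folder not found: {data_dir}")
```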

Print Github folder’s contents

Among the contents, I use read_csv to load the three marked files into three separate pandas dataframes; a sample of one is printed below.
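Loading the three files could look roughly like the sketch below. Only the deaths file name appears in this article; the confirmed and recovered names follow the same JHU CSSE naming pattern. The data_dir path, the labels, and the load_frames helper are my own assumptions for illustration.

```python
import os

import pandas as pd

# Hypothetical location of the local clone; change it to match your machine.
data_dir = os.path.join("COVID-19", "csse_covid_19_data", "csse_covid_19_time_series")

# Label -> file name for the three global time series.
files = {
    "confirmed": "time_series_covid19_confirmed_global.csv",
    "recovered": "time_series_covid19_recovered_global.csv",
    "death": "time_series_covid19_deaths_global.csv",
}


def load_frames(folder, file_names):
    """Read each CSV in `file_names` from `folder` into its own dataframe."""
    return {
        label: pd.read_csv(os.path.join(folder, name))
        for label, name in file_names.items()
    }


if os.path.isdir(data_dir):
    frames = load_frames(data_dir, files)
    # Peek at the first five rows of one of the three dataframes.
    print(frames["death"].head())
```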

time_series_covid19_deaths_global.csv to dataframe first 5 rows (before preprocessing)

Preprocessing data

As usual, we need to prepare the data for consumption. The desired dataframe, shown below, has five columns.

I will explain the process for one of the three data frames, since all three are processed in exactly the same way.

Desired data format for motion chart creation (after preprocessing)

Comparing the two data structures, it’s clear that we need to do some preparation. First, let’s drop the extra columns from the original data frame.

Next, we need to convert everything into columns. There are numerous ways to do that. My approach is to use groupby on the country names so that each country collapses into a single row. (Remember that in the original data set, because of the “province/state” field, the same country was repeated across multiple rows.)

A quick note on groupby(): it needs an aggregation function for the numerical columns, and I used mean(). Keep in mind that mean() averages over a country’s provinces, whereas sum() would return the country total.

Now, to convert the dates into a column, I used transpose(), reset_index(), and finally rename().
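The drop, groupby, and transpose steps above can be sketched as follows. The sample rows are made up and only mimic the JHU CSSE wide layout (one column per date); the variable names are my own.

```python
import io

import pandas as pd

# Tiny made-up sample in the wide layout: country "A" is split into two provinces.
raw_csv = io.StringIO(
    "Province/State,Country/Region,Lat,Long,1/22/20,1/23/20\n"
    "A-North,A,30.9,112.2,17,17\n"
    "A-South,A,40.1,116.4,0,1\n"
    ",B,43.0,12.0,0,2\n"
)
deaths_raw = pd.read_csv(raw_csv)

# 1. Drop the columns the chart does not need.
deaths = deaths_raw.drop(columns=["Province/State", "Lat", "Long"])

# 2. Collapse provinces into one row per country.
#    mean() follows the article; .sum() would give country totals instead.
by_country = deaths.groupby("Country/Region").mean()

# 3. Turn the date headers into a 'Date' column, leaving one column per country.
deaths_wide = (
    by_country.transpose()
    .reset_index()
    .rename(columns={"index": "Date"})
)

print(deaths_wide)
```

After these steps each row is a date and each remaining column is a country, which is the shape the melt step expects.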

time_series_covid19_deaths_global.csv to dataframe first 5 rows

Next, we need to melt each data frame so that the country names move into a column of their own. We can then add the values of ‘confirmed’, ‘recovered’, and ‘death’ cases as three separate columns to arrive at the desired structure.

  1. Melt each data frame separately:

  2. Join them using merge, one at a time:
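Here is a rough sketch of the melt and merge steps, using made-up numbers. It assumes each wide frame has a ‘Date’ column plus one column per country; to_long is a helper name I introduced for illustration.

```python
import pandas as pd

# Made-up wide frames: a 'Date' column plus one column per country.
dates = ["1/22/20", "1/23/20"]
confirmed_wide = pd.DataFrame({"Date": dates, "A": [548, 643], "B": [0, 2]})
recovered_wide = pd.DataFrame({"Date": dates, "A": [28, 30], "B": [0, 0]})
deaths_wide = pd.DataFrame({"Date": dates, "A": [17, 18], "B": [0, 0]})


def to_long(wide, value_name):
    """Melt a wide frame into Date / Country / <value_name> rows."""
    return wide.melt(id_vars="Date", var_name="Country", value_name=value_name)


# 1. Melt each data frame separately.
confirmed = to_long(confirmed_wide, "confirmed")
recovered = to_long(recovered_wide, "recovered")
deaths = to_long(deaths_wide, "death")

# 2. Join them with merge, one at a time, on the shared Date and Country keys.
merged = confirmed.merge(recovered, on=["Date", "Country"])
merged = merged.merge(deaths, on=["Date", "Country"])

print(merged)
```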

We get our desired data frame 💪

The desired five-column data frame

One last step remains: drawing the actual motion chart. But before that, to get a cleaner result, I would like to filter the data down to the countries with the highest number of ‘Confirmed’ cases.

The output reflects the list of countries as of this date (Apr 15th).

  3. Select rows:

With the selected countries stored as a set(), I used .loc to select only the required rows and saved them in the top_countries_df data frame.
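One possible way to do this selection is sketched below with made-up data. How the top countries are picked here (the highest ‘confirmed’ value per country, keeping the top two with nlargest) is my assumption, not necessarily what the original notebook does.

```python
import pandas as pd

# Made-up long-format data with the five columns from the previous step.
merged = pd.DataFrame(
    {
        "Date": ["4/14/20"] * 3 + ["4/15/20"] * 3,
        "Country": ["A", "B", "C"] * 2,
        "confirmed": [600, 170, 70, 630, 180, 76],
        "recovered": [40, 60, 45, 50, 70, 50],
        "death": [25, 18, 4, 28, 19, 5],
    }
)

# Highest confirmed count seen for each country.
peak_confirmed = merged.groupby("Country")["confirmed"].max()

# Keep the names of the two countries with the most confirmed cases as a set.
top_countries = set(peak_confirmed.nlargest(2).index)

# Use .loc with a boolean mask to keep only the rows for those countries.
top_countries_df = merged.loc[merged["Country"].isin(top_countries)]

print(top_countries_df)
```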

Motion Chart

To draw the motion chart, we need to fill in the parameters listed below; the MotionChart library that we import takes care of the rest.

key: ‘Date’, the main driver of the chart

xaxis: ‘confirmed’ cases

yaxis: ‘recovered’ cases

Bubble size: the number of ‘death’ cases

category: the country list

These were my preferences; feel free to change any of them to suit your taste.

Publish

How you publish the chart depends on your environment. As I was using a Jupyter Notebook, I used .to_notebook(). To publish the same result on the web, use .to_browser().

And here is the final result:

Final graph as of Apr 17 2020
Final graph including the sliding bar opened

Please comment below should you have any questions/feedback/comments.
