COVID-19 Data Analysis using Data Science in Python

Ashita Saxena
Published in Analytics Vidhya
3 min read · Jun 8, 2020


Coronavirus disease 2019 (COVID-19) is an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).

This pandemic has caused global social and economic disruption, including the largest global recession since the Great Depression.

In this article, we analyse COVID-19 data using Python and some graphing libraries. You can project the total number of confirmed cases of COVID-19 and display the total number of deaths for a country (this article uses India as an example) on a given date. Because people often find data easier to interpret visually, this article also demonstrates how to create a graph for various parameters.

Let’s get started:

STEP-1: Import the following libraries:

import pandas as pd
from matplotlib import pyplot as plt
from collections import OrderedDict

STEP-2: Read the data from any sample source:

Here, I am using a sample COVID-19 dataset stored in an Excel file with the .xlsx extension.

df = r"C:\Users\xyz\file\Covid19.xlsx"
d = pd.read_excel(df)

This COVID-19 record contains the following fields:

1- Date

2- Name of State/UT

3- Total confirmed cases(Indian National)

4- Total confirmed cases(Foreign National)

5- Cured/ Discharged/ Migrated

6- Latitude

7- Longitude

8- Death

9- Total Confirmed Cases

Let’s read the first five rows:

d.head()
The above output denotes the first five rows of the COVID-19 record

STEP-3: Drop the unnecessary columns that are not required:

From the above record, we will be analyzing the data on the basis of 3 fields, i.e., Date, Name of State/UT, and Total Confirmed Cases. So, drop the other columns.

d.drop(['Latitude','Longitude','Death','Total Confirmed cases (Indian National)','Total Confirmed cases ( Foreign National )','Cured/Discharged/Migrated'],axis = 1,inplace =True)
d.head()
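An equivalent approach is to select the columns you want to keep rather than list the ones to drop. The sketch below assumes the three columns of interest are named exactly `Date`, `Name of State / UT` and `Total Confirmed cases`; the names in your file may differ, and the tiny DataFrame uses made-up values purely for illustration.

```python
import pandas as pd

# Hypothetical miniature of the dataset; all values are invented for illustration.
sample = pd.DataFrame({
    "Date": ["2020-03-01", "2020-03-01"],
    "Name of State / UT": ["Kerala", "Delhi"],
    "Latitude": [10.85, 28.70],
    "Longitude": [76.27, 77.10],
    "Death": [0, 0],
    "Total Confirmed cases": [3, 1],
})

# Assumed column names; adjust them to match the headers in your own file.
keep = ["Date", "Name of State / UT", "Total Confirmed cases"]

# Selecting a list of columns returns a new DataFrame and leaves `sample` intact.
trimmed = sample[keep].copy()
print(trimmed.columns.tolist())
```

Selecting columns this way avoids modifying the original DataFrame in place, which can be handy if the dropped fields are needed again later.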
Output

STEP-4: Create a dictionary for storing Dates and Name of States/UT:

This is done in order to obtain the total confirmed cases state-wise on each and every date.

Now, let’s dive into code:
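One way this step could look is sketched here. It is a minimal illustration, not necessarily the exact code used for this article: the column names and sample rows are assumptions made up for the example, and `cases_by_state` is a hypothetical name.

```python
from collections import OrderedDict

import pandas as pd

# Hypothetical sample rows; the numbers are invented for illustration only.
d = pd.DataFrame({
    "Date": ["2020-03-01", "2020-03-01", "2020-03-02", "2020-03-02"],
    "Name of State / UT": ["Kerala", "Delhi", "Kerala", "Delhi"],
    "Total Confirmed cases": [3, 1, 5, 2],
})

# Outer key: state name. Inner OrderedDict: date -> cumulative confirmed cases,
# kept in the order in which the dates appear in the data.
cases_by_state = {}
for date, state, total in d.itertuples(index=False):
    cases_by_state.setdefault(state, OrderedDict())[date] = int(total)

for state, by_date in cases_by_state.items():
    print(state, dict(by_date))
```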

After performing the above code, the output will be as follows:

STEP-5: Finding the new cases arising every day state-wise:

This can be done by creating a list containing the dates and total confirmed cases:
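A minimal sketch of this idea follows, assuming the confirmed-case figures are cumulative totals, so that the new cases on a day are the difference from the previous day. The variable names and the numbers for a single state are hypothetical.

```python
from collections import OrderedDict

# Hypothetical cumulative totals for one state (date -> total confirmed cases).
totals = OrderedDict([
    ("2020-03-01", 3),
    ("2020-03-02", 5),
    ("2020-03-03", 5),
    ("2020-03-04", 9),
])

dates = list(totals.keys())
cumulative = list(totals.values())

# New cases on a day = that day's cumulative total minus the previous day's total.
# The first day has no predecessor, so its own total is used as its new-case count.
new_cases = [cumulative[0]] + [
    today - yesterday for yesterday, today in zip(cumulative, cumulative[1:])
]

for date, n in zip(dates, new_cases):
    print(date, n)
```

Repeating this for every state's entry in the dictionary from STEP-4 would give the state-wise daily counts.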

With the help of the above code, we can get the new cases as follows:

STEP-6: Constructing a Graph:

For each date, you should have the number of cases on that date stored in vector N.

After that, for each date, calculate the moving average, starting from A0 = 0.
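The exact formula is not spelled out here, so the following sketch assumes an exponentially weighted moving average of the form A_t = Beta * A_(t-1) + (1 - Beta) * N_t with A_0 = 0, which is consistent with the mention of Beta in the final graph. The function name `moving_average`, the Beta values, and the daily counts are all hypothetical.

```python
from matplotlib import pyplot as plt


def moving_average(N, beta):
    """Exponentially weighted moving average with A_0 = 0 (assumed form)."""
    A = []
    prev = 0.0  # A_0 = 0
    for n in N:
        prev = beta * prev + (1 - beta) * n
        A.append(prev)
    return A


# Hypothetical daily new-case counts; invented for illustration only.
dates = ["03-01", "03-02", "03-03", "03-04", "03-05"]
N = [3, 2, 0, 4, 6]

fig, ax = plt.subplots()
ax.plot(dates, N, marker="o", label="New cases (N)")
for beta in (0.5, 0.9):  # assumed example values of Beta
    ax.plot(dates, moving_average(N, beta), label=f"Beta = {beta}")
ax.set_xlabel("Date")
ax.set_ylabel("Cases")
ax.set_title("Moving average of new cases")
ax.legend()
# plt.show()  # uncomment to display the figure in an interactive session
```

A larger Beta gives more weight to past values, so the curve is smoother but reacts more slowly to changes in the daily counts.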

Final Output:

Graph showing Moving Average with date for different values of Beta

Conclusion:

In this way, we can analyse the dataset using Python and estimate other factors as well.

I hope it will now be easier for you to create, analyse, and monitor data on the consequences and effects of COVID-19.

THANK YOU!!

KEEP LEARNING!!✌
