COVID-19 Data Analysis using Data Science in Python
Coronavirus disease 2019 (COVID-19) is an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).
This pandemic has caused global social and economic disruption, including the largest global recession since the Great Depression.
In this article, we analyse COVID-19 data using Python and a graphing library. With the same approach you can display the total number of confirmed cases of COVID-19, and also the total number of deaths, for a country (this article uses India as an example) on a given date. Raw numbers can be hard to interpret on their own, so this article also demonstrates how to create a graph for various parameters.
Let’s get started:
STEP-1: Import the following libraries:
import pandas as pd
from matplotlib import pyplot as plt
from collections import OrderedDict
STEP-2: Read the data from any sample source:
Here, I am using a sample COVID-19 dataset stored in an Excel file with the .xlsx extension.
df = r"C:\Users\xyz\file\Covid19.xlsx"
d = pd.read_excel(df)
This COVID-19 record contains the following fields:
1- Date
2- Name of State/UT
3- Total confirmed cases(Indian National)
4- Total confirmed cases(Foreign National)
5- Cured/ Discharged/ Migrated
6- Latitude
7- Longitude
8- Death
9- Total Confirmed Cases
Let’s read the first five rows:
d.head()
STEP-3: Drop the columns that are not required:
From the above record, we will be analyzing the data on the basis of 3 fields, i.e., Date, Name of State/UT, and Total Confirmed Cases. So, drop the other columns.
d.drop(['Latitude','Longitude','Death','Total Confirmed cases (Indian National)','Total Confirmed cases ( Foreign National )','Cured/Discharged/Migrated'], axis=1, inplace=True)
d.head()
STEP-4: Create a dictionary for storing Dates and Name of States/UT:
This is done in order to obtain the total confirmed cases state-wise on each and every date.
Now, let’s dive into code:
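The original listing for this step is not reproduced here, so what follows is only a minimal sketch of one way to build such a dictionary, not necessarily the approach originally used. It assumes each record is a (date, state, total confirmed cases) triple. The function name `build_cases_by_state` and the sample rows are hypothetical and invented purely for illustration.

```python
from collections import OrderedDict


def build_cases_by_state(records):
    """Group cumulative confirmed cases by state, then by date.

    `records` is any iterable of (date, state, total_confirmed) triples.
    Returns an OrderedDict: state -> OrderedDict(date -> total_confirmed).
    """
    cases = OrderedDict()
    for date, state, total in records:
        # Create the inner per-date mapping the first time a state appears.
        cases.setdefault(state, OrderedDict())[date] = total
    return cases


# Made-up rows for illustration only; these are NOT real COVID-19 figures.
sample_records = [
    ("2020-03-01", "State A", 3),
    ("2020-03-01", "State B", 1),
    ("2020-03-02", "State A", 5),
    ("2020-03-02", "State B", 4),
]

cases_by_state = build_cases_by_state(sample_records)
```

With the DataFrame from Step 3, something like `build_cases_by_state(d.itertuples(index=False, name=None))` should work, provided the three remaining columns appear in the order date, state, total confirmed cases.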
After running the above code, the output will be as follows:
STEP-5: Find the new cases arising every day, state-wise:
This can be done by creating a list containing the dates and total confirmed cases:
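Since the listing itself is missing, here is a hedged sketch of the idea: if the totals are cumulative, the new cases on a date are the difference between that date's total and the previous date's total. The helper name `daily_new_cases` and the sample values are assumptions made for illustration, and the first date is treated as if no cases were recorded before it.

```python
def daily_new_cases(totals_by_date):
    """Turn cumulative totals into per-day new cases.

    `totals_by_date` is a list of (date, cumulative_total) pairs sorted by date.
    Returns a list of (date, new_cases) pairs of the same length.
    """
    new_cases = []
    previous_total = 0  # assumption: no cases were recorded before the first date
    for date, total in totals_by_date:
        new_cases.append((date, total - previous_total))
        previous_total = total
    return new_cases


# Made-up cumulative totals for one hypothetical state; not real data.
sample_totals = [("2020-03-01", 3), ("2020-03-02", 5), ("2020-03-03", 9)]
sample_new = daily_new_cases(sample_totals)
# sample_new == [("2020-03-01", 3), ("2020-03-02", 2), ("2020-03-03", 4)]
```

Running the same helper once per state gives the state-wise list of new cases.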
With the help of the above code, we can get the new cases as follows:
STEP-6: Constructing a Graph:
For each date, you should have the number of cases on that date stored in a vector N.
After that, for each date, calculate the moving average, starting from A0 = 0.
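The text does not say which kind of moving average is meant, so the sketch below makes an assumption: it computes a cumulative (running) mean seeded with A0 = 0, then plots it over the daily counts. The names `running_average` and `plot_cases`, the plot styling, and the values in `N` are all hypothetical. A fixed-window average would be an equally reasonable reading.

```python
def running_average(values, initial=0.0):
    """Return the running mean A_1..A_n of `values`, seeded with A_0 = initial."""
    averages = []
    average = initial
    for k, value in enumerate(values, start=1):
        # Incremental update: A_k = A_(k-1) + (N_k - A_(k-1)) / k
        average += (value - average) / k
        averages.append(average)
    return averages


def plot_cases(dates, new_cases, averages):
    """Draw daily new cases as bars with the running average as a line."""
    # Imported here so running_average() can be used without matplotlib installed.
    from matplotlib import pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(dates, new_cases, label="New cases")
    ax.plot(dates, averages, color="red", marker="o", label="Running average")
    ax.set_xlabel("Date")
    ax.set_ylabel("Cases")
    ax.legend()
    fig.autofmt_xdate()  # tilt the date labels so they do not overlap
    return fig


# Made-up vector N of daily new cases; not real data.
N = [3, 2, 4, 7]
A = running_average(N)
# A == [3.0, 2.5, 3.0, 4.0]
```

To draw the graph, call `plot_cases(dates, N, A)` with a list of dates of the same length as `N`, then `plt.show()` using the import from Step 1.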
Final Output:
Conclusion:
In this way, we can analyse the dataset using Python and estimate the other factors as well.
I hope that it is now easier for you to create, analyse, and monitor data on the consequences and effects of COVID-19.
THANK YOU!!
KEEP LEARNING!!✌