Calculating the Reproduction Number (Ro) of COVID-19 in India and Visualizing the same using Geopandas and Matplotlib

By Data Dater
Published in Analytics Vidhya, Apr 6, 2020 (3 min read)

Introduction

The disease COVID-19, caused by the novel coronavirus SARS-CoV-2, reached Indian shores in late January through Italian tourists. Since then, it has been confirmed to have infected at least 4289 people as of 5 April 2020. Governments across the country have intervened by imposing a lockdown that asks residents to practise social distancing, testing symptomatic individuals, and tracing their contacts.

In epidemiology, the basic reproduction number of an infection (Ro — pronounced R-naught) can be thought of as the expected number of cases directly generated by one case in a population where all individuals are susceptible to infection. The definition describes the state where no other individuals are infected or immunized.

This article explains the Python code used to compute Ro for each state/union territory of India and to visualize the result.

Source of Data

Time series data of confirmed cases for each state is obtained from the Google doc hosted by the case tracking site https://covid19india.org.

Statewise Time Series of COVID-19 cases in India

Calculating Ro

Ro is calculated by the formula:

EXP((LN(Population/((1/(case[current]/(case[start]*Population)))-1)))/(current-start))
where,
a. LN is the natural logarithm and EXP is the exponential function;
b. case[start] is the case count on the starting day;
c. case[current] is the case count on the current day;
d. Population is the population of the state/UT;
e. current is the number of days since the start of the outbreak, and start = 0.
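As a minimal sketch, the formula above can be transcribed into a small Python function. The function name and argument names are illustrative choices, not taken from the original notebook, and the input check is an added assumption.

```python
import math


def reproduction_number(case_start, case_current, population, days):
    """Evaluate the article's Ro formula for one state/UT.

    case_start   -- case count on the starting day
    case_current -- case count on the current day
    population   -- population of the state/UT
    days         -- current - start, in days
    """
    if case_start <= 0 or case_current <= 0 or days <= 0:
        raise ValueError("case counts and days must be positive")
    # (1 / (case[current] / (case[start] * Population))) - 1
    denominator = (case_start * population) / case_current - 1
    # EXP(LN(Population / denominator) / (current - start))
    return math.exp(math.log(population / denominator) / days)


# Illustrative numbers only: cases double from 100 to 200 over 5 days.
print(round(reproduction_number(100, 200, 100_000_000, 5), 4))  # roughly 1.1487
```

When the population is much larger than the case counts, the expression is very close to (case[current] / case[start]) raised to the power 1 / (current - start), which is why the example returns almost exactly the fifth root of 2.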

Python Code

Importing the case time series file

First, the case time series file is imported.
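The original code listing is not reproduced here, so the following is only a sketch of how such a file might be loaded with pandas. The column layout (a `Date` column plus one column of cumulative cases per state/UT), the placeholder state names and all the numbers are assumptions made for illustration. In practice `load_case_series` would be given the path of the downloaded CSV instead of the in-memory sample.

```python
import io

import pandas as pd

# Hypothetical stand-in for the downloaded sheet. The values are made up.
SAMPLE_CSV = """Date,State A,State B
2020-04-01,10,5
2020-04-02,14,5
2020-04-03,20,6
2020-04-04,28,8
2020-04-05,40,10
"""


def load_case_series(source):
    """Read the time series and index it by date."""
    return pd.read_csv(source, parse_dates=["Date"], index_col="Date")


cases = load_case_series(io.StringIO(SAMPLE_CSV))
print(cases.tail())
```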

Declaring the hyperparameters

Next, the hyperparameters are set: the lookback value on which the Ro calculation is based, and the population of India.

Calculating Ro for each State/UT

Lists are declared to store the names of the states/UTs and their Ro values. Based on the computed Ro, a prediction of the cumulative case count for the next day is also made.
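One way this step could look is sketched below. Several details are assumptions rather than the original code: the lookback value, the population figure, the sample data, taking the starting count `LOOKBACK` days before the latest row, and predicting the next day's count as the latest count multiplied by Ro.

```python
import math

import pandas as pd

LOOKBACK = 4                # hypothetical lookback window, in days
POPULATION = 1_350_000_000  # rough, assumed figure for the population of India

# Made-up cumulative case counts, one column per (placeholder) state/UT.
cases = pd.DataFrame(
    {"State A": [10, 14, 20, 28, 40], "State B": [5, 5, 6, 8, 10]},
    index=pd.date_range("2020-04-01", periods=5),
)


def reproduction_number(case_start, case_current, population, days):
    """Transcription of the article's Ro formula."""
    denominator = (case_start * population) / case_current - 1
    return math.exp(math.log(population / denominator) / days)


states, ro_values, predictions = [], [], []
for state in cases.columns:
    start = int(cases[state].iloc[-1 - LOOKBACK])
    current = int(cases[state].iloc[-1])
    if start <= 0:
        continue  # the formula is undefined without cases on the starting day
    ro = reproduction_number(start, current, POPULATION, LOOKBACK)
    states.append(state)
    ro_values.append(ro)
    # Assumed prediction rule: next day's cumulative count = latest count * Ro.
    predictions.append(int(round(current * ro)))

for row in zip(states, ro_values, predictions):
    print(row)
```

States with no cases at the start of the window are skipped here because the formula would otherwise divide by zero. The original notebook may handle that case differently.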

Writing the Ro to DataFrame

The names of the states/UTs, their Ro and predicted cumulative cases are written to a new dataframe which will be used for geospatial visualization.
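A short sketch of this step, assuming the three lists from the previous step are available. The column names and the sample values are illustrative, not taken from the original notebook.

```python
import pandas as pd

# Placeholder results standing in for the lists built in the previous step.
states = ["State A", "State B"]
ro_values = [1.41, 1.19]
predictions = [57, 12]

ro_df = pd.DataFrame(
    {"State": states, "Ro": ro_values, "Predicted cases": predictions}
)
# Sorting is optional; it just makes the highest Ro values easy to spot.
ro_df = ro_df.sort_values("Ro", ascending=False).reset_index(drop=True)
print(ro_df)
```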

Importing Geopandas and Shapefile

To visualize the Ro for each state/UT, the geopandas library is installed using the command:

pip install geopandas

If you encounter difficulties while downloading the package due to incompatible dependencies, then this guide should help.

Next, the shapefile for India is downloaded from here.

Site for downloading free geographic data

The shapefiles are downloaded as a zipped folder containing several file types. The .shp, .shx, .dbf, .prj and .cpg files need to be extracted to the working directory.
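Once the files are extracted, the shapefile can be read with `geopandas.read_file`. The sketch below uses a hypothetical file name, `india_states.shp`, which should be replaced with the name of the extracted .shp file. The read is guarded so the snippet also runs when the file is absent.

```python
from pathlib import Path

import geopandas as gpd

# Hypothetical file name; the companion .shx, .dbf, .prj and .cpg files
# are expected to sit next to it in the working directory.
SHAPEFILE = Path("india_states.shp")


def load_state_boundaries(path):
    """Read the state/UT polygons into a GeoDataFrame."""
    return gpd.read_file(path)


if SHAPEFILE.exists():
    geo_df = load_state_boundaries(SHAPEFILE)
    print(geo_df.head())
else:
    print(f"{SHAPEFILE} not found; extract the shapefile first")
```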

Merging the geodataframe and DataFrame

The geodataframe and the DataFrame are merged on the column holding the state/UT names.
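The merge might look like the sketch below. It uses tiny in-memory rectangles instead of the real shapefile, and the column names `st_nm` and `State` are assumptions; the actual name column depends on the shapefile used. A left merge keeps every polygon, which makes name mismatches between the two sources easy to find.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

# Stand-in geodataframe: three placeholder "states" drawn as unit squares.
geo_df = gpd.GeoDataFrame(
    {"st_nm": ["State A", "State B", "State C"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(2, 0, 3, 1)],
)
ro_df = pd.DataFrame({"State": ["State A", "State B"], "Ro": [1.41, 1.19]})

# Calling merge on the geodataframe keeps the result a GeoDataFrame.
merged = geo_df.merge(ro_df, left_on="st_nm", right_on="State", how="left")

# Polygons without a matching Ro usually point to spelling differences
# in the state/UT names between the two sources.
unmatched = merged.loc[merged["Ro"].isna(), "st_nm"].tolist()
print(unmatched)
```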

Generating and Displaying the Choropleth by the Ro

Finally, the choropleth map is built on the basis of Ro and displayed.
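A minimal sketch of the plotting step with `GeoDataFrame.plot` and Matplotlib follows. The geometry and Ro values are placeholders, and the colour map, title and output file name are arbitrary choices rather than those of the original notebook. The figure is saved to a file so the snippet runs without a display; in a notebook, `plt.show()` would display it instead.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed

import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import box

# Placeholder merged geodataframe: unit squares with made-up Ro values.
merged = gpd.GeoDataFrame(
    {"State": ["State A", "State B", "State C"], "Ro": [1.41, 1.19, 1.05]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(2, 0, 3, 1)],
)

fig, ax = plt.subplots(figsize=(8, 8))
# Shade each polygon by its Ro value and add a colour bar as the legend.
merged.plot(column="Ro", cmap="OrRd", linewidth=0.8, edgecolor="0.8",
            legend=True, ax=ax)
ax.set_axis_off()
ax.set_title("Ro by state/UT")
fig.savefig("ro_choropleth.png", dpi=150, bbox_inches="tight")
```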

The raw files, code and notebook referred to in this article can be found at this GitHub repo.
