Geographical Mapping to Visualize Covid-19 Cases in India

Ashita Saxena
Published in Analytics Vidhya
Jun 13, 2020

Data science is the extraction of actionable insights from raw data. It is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data.

Coronavirus, or Covid-19, needs no introduction. It has already been declared a pandemic by the WHO, and in the past couple of weeks its impact has been deleterious. In this article, I am going to show how one can visualize a Covid-19 dataset using geographical plots and track the spread of Covid-19 in India using Python.

AIM: To plot the states having Coronavirus cases in India on an Indian Map using Python.

Packages used:

1- Pandas

2- Matplotlib

3- Seaborn

4- GeoPandas

Collecting the data:

1- For the Covid-19 dataset, I downloaded the .csv file containing the latest state-wise Covid-19 records for India.

2- For the Indian map, I downloaded the Indian States shape file from www.igismap.com.

Let’s jump into Code:

Step-1: Import necessary libraries →

import seaborn as sns
import geopandas as gpd
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Step-2: Import the Shape File →

After importing the shape file, the output will be as follows:

Step-3: Plot the map →

Step-4: Now import the Covid-19 .csv file that you have downloaded, then drop the unnecessary columns from the dataset, as we are only concerned with the states and confirmed cases →

d.drop(['Recovered', 'Deaths', 'Active', 'Last_Updated_Time', 'Migrated_Other',
        'State_code', 'Delta_Confirmed', 'Delta_Recovered', 'Delta_Deaths',
        'State_Notes'], axis=1, inplace=True)

The following will be the output:

Record containing only states and confirmed cases
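As an alternative to listing every column to drop, one could simply keep the two columns of interest. The sketch below assumes a tiny made-up DataFrame standing in for the downloaded CSV; the state names and numbers are illustrative only, not real Covid-19 figures, and `d_small` is a hypothetical name.

```python
import pandas as pd

# Toy stand-in for the state-level CSV; all values are made up for illustration.
d = pd.DataFrame({
    "State": ["State A", "State B"],
    "Confirmed": [100, 20],
    "Recovered": [60, 5],
    "Deaths": [3, 0],
    "Active": [37, 15],
})

# Keep only the columns needed for the map instead of dropping all the others.
d_small = d[["State", "Confirmed"]].copy()
print(d_small)
```

Selecting the wanted columns keeps working even if the source file later gains extra columns, whereas a drop list would have to be updated.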

Step-5: This is the most important step: we join the two datasets on the state name to obtain the required result →

Using join operation to join both dataframes
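To make the join step more concrete, here is a minimal sketch using plain pandas DataFrames as stand-ins. `shapes` plays the role of `map_df` and `cases` the role of `d`; both, along with the state names and counts, are made up for illustration. It shows why NaN values can appear: a state name present in the map but absent (or spelled differently) in the case data has nothing to match against.

```python
import pandas as pd

# Toy stand-ins: 'shapes' mimics map_df (one row per state), 'cases' mimics d.
shapes = pd.DataFrame({"st_nm": ["State A", "State B", "State C"]})
cases = pd.DataFrame({
    "State": ["State A", "State B", "State X"],
    "Confirmed": [100, 20, 7],
})

# Left join on the state name: every row of 'shapes' is kept.
merged = shapes.set_index("st_nm").join(cases.set_index("State"))

# States in the map with no matching name in the case data end up as NaN.
unmatched = merged.index[merged["Confirmed"].isna()].tolist()
print(unmatched)  # ['State C']
```

Checking the unmatched names before filling NaN with 0 may help catch spelling differences between the shape file and the CSV, which would otherwise show up on the map as states with zero cases.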

Since the above output contains NaN values, I perform data cleaning on the resulting dataset in order to replace the NaN values with 0.

Data is now Cleaned
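A small sketch of the cleaning step, assuming a toy `merged` frame with made-up values. It assigns the filled column back to the DataFrame; recent pandas versions warn when `inplace=True` is used on a column selected from a DataFrame, so the assignment form is the safer choice.

```python
import pandas as pd

# Toy joined result with one missing value; numbers are illustrative only.
merged = pd.DataFrame(
    {"Confirmed": [100.0, None, 20.0]},
    index=["State A", "State B", "State C"],
)

# Replace NaN with 0 and assign the result back to the column.
merged["Confirmed"] = merged["Confirmed"].fillna(0)
print(merged)
```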

Step-6: Finally, write the code for plotting the graph →

The following output will be obtained:

Map showing the highest Coronavirus affected state in India.
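The plotting step can be tried without the real shape file. The following sketch assumes three made-up rectangular "regions" built with shapely; they are hypothetical geometries, not Indian state boundaries, and the counts are invented. It uses the same `plot` arguments as the article's code.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line in a notebook
import matplotlib.pyplot as plt
import geopandas as gpd
from shapely.geometry import box

# Three toy regions with made-up case counts (not real data).
toy = gpd.GeoDataFrame(
    {"st_nm": ["Region A", "Region B", "Region C"],
     "Confirmed": [120, 0, 45]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(0, 1, 2, 2)],
)

fig, ax = plt.subplots(1, figsize=(10, 6))
ax.axis("off")
ax.set_title("Toy choropleth", fontdict={"fontsize": 25})

# Colour each region by its 'Confirmed' value; darker means more cases.
toy.plot(column="Confirmed", cmap="PuBu", linewidth=0.8,
         ax=ax, edgecolor="0.5", legend=True)
```

The `column` argument decides which values drive the colouring, `cmap` picks the colour scale, and `legend=True` adds a colour bar so the shades can be read as case counts.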

The complete code is as follows:

import seaborn as sns
import geopandas as gpd
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
sns.set_style('whitegrid')

# importing the shape file:
fp = r"C:\Users\LENOVO\Downloads\Indian_States.shp"
map_df = gpd.read_file(fp)
map_df
map_df.plot()

# importing the covid-19 dataset csv file:
df = r"C:\Users\xyz\Downloads\datasets_549966_1241285_state_level_latest.csv"
d = pd.read_csv(df)

# dropping unwanted columns:
d.drop(['Recovered', 'Deaths', 'Active', 'Last_Updated_Time', 'Migrated_Other',
        'State_code', 'Delta_Confirmed', 'Delta_Recovered', 'Delta_Deaths',
        'State_Notes'], axis=1, inplace=True)
d.head()

# joining the dataframes on the state name:
merged = map_df.set_index('st_nm').join(d.set_index('State'))
merged.head()

# data cleaning: replace NaN with 0 (assigned back to the column)
merged["Confirmed"] = merged["Confirmed"].fillna(0)
print(merged)

# plot the graph:
fig, ax = plt.subplots(1, figsize=(10, 6))
ax.axis('off')
ax.set_title('Covid 19 Data 2020 State Wise', fontdict={'fontsize': '25', 'fontweight' : '3'})
merged.plot(column='Confirmed', cmap='PuBu', linewidth=0.8, ax=ax, edgecolor='0.5', legend=True)

Hence, with the help of the above code, one can easily track and visualize Covid-19 cases in the different states of India by plotting them on a map.

THANK YOU!!😊
