Graphical View of Coronavirus Live Updates Using Python

Saicharan Kr · Published in Analytics Vidhya · Mar 26, 2020

Web scraping data from a table in a web page using Python

In this article, we are going to extract data from the table on the Worldometer website (https://www.worldometers.info/coronavirus/), store it as a CSV or JSON file, and visualize it using D3.js.

What is web scraping?

In simple terms, it is the process of gathering information or data from different web pages (HTML sources). The data gathered this way can be used to build datasets or databases for different applications (data analysis, a price-comparison application, etc.).
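For example, here is a minimal sketch of the idea, using the same Worldometer page this article scrapes later: download a page with requests, parse it with BeautifulSoup, and read something out of the HTML.

import requests
from bs4 import BeautifulSoup

# Download the raw HTML of the page
html = requests.get("https://www.worldometers.info/coronavirus/").text

# Parse the HTML so individual elements can be searched
soup = BeautifulSoup(html, "html.parser")

# Read the page title as a quick sanity check
print(soup.title.get_text())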

Prerequisite:-

1. Basic understanding of Python 3.0 programming.

2. Python 3.0 or above installed on your PC (don’t forget to add Python to the PATH while installing).

Libraries we are using:-

1. BeautifulSoup.

2. Pandas.

3. Requests.

The following are the steps to proceed with the project.

Step-1:- Creating the virtualenv (same for Windows and Linux).

Creating the virtualenv keeps our project self-contained: we install all the libraries required for this project into this virtualenv.

#Upgrading pip

python -m pip install --upgrade pip

#installing virtualenv

pip install virtualenv

#creating Virtualenv

virtualenv [Name of environment] #enter the name of env without [].

Ex:- virtualenv env

Step-2:- Activating the Virtualenv and installing the required libraries.

Windows:-

If required:

(Open Windows PowerShell as administrator and allow script execution, so the env can be activated in the PowerShell window, with the command below.)

Set-ExecutionPolicy RemoteSigned

Now, to activate the env:-

.\env\Scripts\activate

(Screenshot: activating the env in a PowerShell window.)

If the env is activated, you will see (env) at the beginning of the prompt.

On Linux, activate the env with: source env/bin/activate

Installing Required Libraries:-

#installing BeautifulSoup

pip install bs4

#installing pandas.

pip install pandas

#installing requests.

pip install requests

It is always best practice to freeze the required libraries into requirements.txt:

pip freeze > requirements.txt
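Later, the same set of libraries can be reinstalled from this file in one step:

pip install -r requirements.txt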

Step-3:- Open the web page, navigate to the table you want to collect data from, right-click, and click on Inspect.

Understand the HTML structure now: note the table’s id and how its header and body rows are laid out, since the Python code will use these to locate the data (a small sketch follows).
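As a rough sketch of locating the table once you know its id from the Inspect panel: the id main_table_countries_today below is what the Worldometer page used at the time of writing, and is an assumption you should verify in your own Inspect view.

import requests
from bs4 import BeautifulSoup

html = requests.get("https://www.worldometers.info/coronavirus/").text
soup = BeautifulSoup(html, "html.parser")

# The table id is an assumption taken from the Inspect panel; verify it yourself
table = soup.find("table", {"id": "main_table_countries_today"})

# Each <tr> inside the table body is one country's row
rows = table.find("tbody").find_all("tr")
print(len(rows), "rows found")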

Step-4:- Now proceed with the program.

D3.js Chart template:-

(The D3.js chart template is embedded in the original post; the full code is available in the GitHub repository linked below.)

Python Programming:-

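The author’s full script is in the GitHub repository linked below. As a rough sketch of the same approach (not the original code), the following pulls the table, builds a pandas DataFrame, and writes both data.csv and data.json; the table id and the output file names here are assumptions for illustration.

import requests
import pandas as pd
from bs4 import BeautifulSoup

URL = "https://www.worldometers.info/coronavirus/"

# Fetch and parse the page
soup = BeautifulSoup(requests.get(URL).text, "html.parser")

# Table id is an assumption; confirm it with Inspect (Step-3)
table = soup.find("table", {"id": "main_table_countries_today"})

# Column names come from the header row, values from the body rows
headers = [th.get_text(strip=True) for th in table.find("thead").find_all("th")]
rows = []
for tr in table.find("tbody").find_all("tr"):
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if len(cells) == len(headers):
        rows.append(cells)

# Build a DataFrame and save it in the formats D3.js can consume
df = pd.DataFrame(rows, columns=headers)
df.to_csv("data.csv", index=False)
df.to_json("data.json", orient="records")

The generated data.json can then be loaded by the D3.js chart template (for example with d3.json) to draw the chart.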

D3.js image output:-

(Screenshot of the chart rendered by D3.js in the original post.)

Data.json output file:-

(Screenshot of the generated data.json file in the original post.)
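With orient="records" (as in the sketch above), data.json holds a list of objects, one per country row, keyed by the table’s column headers. A quick way to sanity-check the generated file:

import json

# Load the generated file and inspect the first record
with open("data.json") as f:
    records = json.load(f)

print(len(records), "records")
print(records[0])  # a dict mapping column header -> cell value for one country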

Refer to the code on GitHub here:- https://github.com/saicharankr/WebScrap

Originally published at https://just-python.blogspot.com.
