EDA on Blood Bank Directory of India using SQL

Shruti Patkar
Learning SQL
Published in
5 min readMar 16, 2023
Image on Pexels

The topic of this article is to conduct Exploratory Data Analysis on the ‘Blood Bank Directory of India’ and we will be using MySQL workbench for the data exploration.

The aim of doing the EDA is to understand the facts and figures related to Blood Banks in India such as the number of blood banks in different states, facilities provided, service time, category, license, contact number, etc…

The dataset that I have taken for EDA is publicly available on Open Data Platform- India. The link is here.

Before importing the dataset, here are the things to remember

Change the data type of the column Pin code, Contact Number, and Mobile to text because at some rows the cell value is not an integer but is a text — ‘NA’ or similar to that.

After importing the dataset, we will first run the below query to have a quick visual of our dataset.

Image by Author- Introduction to Dataset

Over here we can see that there are 2823 rows in the dataset

First, we will look at the total number of blood banks present in India which is showing as 2823 in the below query.

Image by Author-Total number of Blood Banks in India

Now let’s have a close look at the state-wise count of Blood Banks

Image by Author-State-wise count of Blood Banks in India

Here we can say that the state having the highest number of Blood Banks is Maharashtra with a count of 332 following Tamil Nadu with 291 and so on.

Now we will explore the service provided by Blood Banks. For this, we will first go with ‘Apheresis’ service availability.

Image by Author-Apheresis service Availability

Over here we can say that out of 332 blood banks in Maharashtra only 74 blood banks have the facility of Apheresis. Similarly, we can compare other states as well.

The important parameter we consider when looking for Blood Bank is the service time, as a medical emergency and blood requirement can occur at any given point. Let’s see the count of blood banks that are functional 24/7.

Understand that over here I have used ‘like’ function to retrieve rows that only contain 24/7 as service time.

Image by Author-24/7 Functional Blood Banks

Again, over here we can see that for Maharashtra out of 332 blood banks, only 269 are 24/7 functional whereas for Tamil Nadu it’s 250 out of 291.

Now we will look at the functional category of blood banks i.e Private, Charity, and Government.

Image by Author-Charity and Government Blood Banks count

From the above image, it’s clear that Government and Charity blood banks hold the major count in Maharashtra while for Tamil Nadu the shift is towards private blood banks.

Here I have used ‘in’ function to filter out the Blood Banks of Government and Charity category. You can change it according to your requirement.

Next, we will look at the parameter of Blood Component Availability

Image by Author-Blood Component Availability

Here we can say that the highest number of blood banks offering blood components along with whole blood remains from Maharashtra and the second state comes up as Uttar Pradesh surpassing Tamil Nadu.

Now let’s have a look at the authority aspect of the Blood Bank, let’s see the blood banks with the license details provided in the dataset.

Image by Author-License details provided

So, it’s clear that out of 332 blood banks in Maharashtra 287 have their license details provided in the dataset following 206 out of 291 in Tamil Nadu.

Now let’s create a case in which we will provide the conditions to be met and the output that needs to be shown if the criteria are met.

So here we will create a case of ‘Totally Functional’ blood banks in which we will have an additional column of Functionality Status in our dataset which tells us whether the particular blood bank is ‘Totally Functional’ or ‘Partially Functional’.

The criteria for Totally Functional Blood Bank are:

· Blood Component Availability should be Yes

· Apheresis facility should be provided

· Service time should be 24/7

· License details should be mentioned

We will see the above list for Maharashtra state so it will come under the where clause as a filter. Let’s look at the results

Image by Author-Functionality Status- Case

The above image gives us the list of blood banks with their necessary contact details and functionality status to help us in choosing the best.

Here we can sort the column by clicking on Functionality_Status.

Now we will see the states with less than 30 blood banks present.

Image by Author-States- Less than 30 Blood Banks

Here we can say that there are a total of 16 states (Note that- In this dataset, Union Territories have been considered as states only) with less than 30 blood banks.

These were the few questions that I could think of for the EDA- Blood Bank Directory. You can create as many questions as possible and frame SQL queries accordingly.

I hope you enjoyed the EDA and understood the SQL queries.

I have created a dashboard for the same project. Check it out here.

The complete project is available on my GitHub profile.

Thank You!

Image by Author-Dashboard

Go from SELECT * to interview-worthy project. Get our free 5-page guide.

--

--

Shruti Patkar
Learning SQL

Data Enthusiast in Healthcare | SQL | Excel | Power BI | R