Day 62 of 100DaysofML

Published in

100DaysofMLcode

4 min read · Aug 27, 2020

Heart Disease plots using Bubbly. For today’s blog, I thought of picking up a random dataset and creating plots that make the data easier to visualize and analyze. I decided to use a Python package called Bubbly, which is mainly used to plot interactive and animated bubble charts.

I shall explain the code along with the implementation, so I would recommend installing the packages below in order to follow along.

!pip install bubbly
!pip install pandas-profiling

Pip should pull in the dependencies as well, whether you are working in an environment such as Anaconda or in a Kaggle notebook. The leading ! is notebook syntax, so drop it if you run pip from a terminal. Let us start by importing the libraries.

import numpy as np
import pandas as pd
import pandas_profiling
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.offline as py
from plotly.offline import init_notebook_mode, iplot
import plotly.graph_objs as go
init_notebook_mode(connected = True)
from bubbly.bubbly import bubbleplot
import warnings
warnings.filterwarnings('ignore')

The next step is to load the dataset into the environment.

data = pd.read_csv('../input/heart.csv')
data.shape

Size of heart dataset

If you want to understand the dataset, you can run the commands below to familiarize yourself with it.

data.head()
data.describe()
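Before plotting, it can also help to confirm that the columns used later are present and have no missing values. The snippet below is a minimal sketch of that check. The DataFrame `sample` and its values are made up for illustration; only the column names are taken from the plotting call later in this post. In the notebook you would run the same checks on `data` instead.

```python
import pandas as pd

# Hypothetical stand-in rows: the values are illustrative only,
# the column names match those used in the bubble plot later on.
sample = pd.DataFrame({
    "age": [45, 52, 60],
    "sex": [0, 1, 1],
    "trestbps": [120, 135, 150],
    "chol": [210, 240, 265],
    "oldpeak": [0.0, 1.2, 2.5],
})

# Columns the bubble plot relies on
needed = ["trestbps", "chol", "sex", "age", "oldpeak"]

# Any required column that is absent from the frame
missing_columns = [c for c in needed if c not in sample.columns]

# Number of null values per required column
null_counts = sample[needed].isnull().sum()

print(sample.dtypes)
print(missing_columns)
print(null_counts)
```

If `missing_columns` is non-empty or any count in `null_counts` is above zero, it is worth cleaning the data before handing it to the plotting function.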

Pandas profiling is a whole different topic which I shall be covering in my upcoming blogs, but for today’s blog I shall cover just the essentials.

Pandas profiling is an open source Python module that lets us do a quick exploratory data analysis with just a few lines of code. If that is not enough to convince you to use it, it also generates interactive reports in web format that can be presented to anyone, even if they don’t know programming.

Check out the link below to learn more about pandas profiling.

Exploratory Data Analysis with Pandas Profiling

Pandas Profiling, the perfect tool for exploratory data analysis.

towardsdatascience.com

Run the commands below to generate the pandas profiling report for the dataset. I haven’t taken screenshots of the report for this dataset, but I have attached a sample below of what the profiling looks like.

profile = pandas_profiling.ProfileReport(data)
profile
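Since the report is in web format, it can also be written to an HTML file and shared. The helper below is a small sketch of that idea. The function name `save_profile_report` and the default file name are my own placeholders, and it assumes the installed pandas_profiling version exposes `ProfileReport.to_file`.

```python
def save_profile_report(df, path="heart_profile.html"):
    """Write a profiling report for `df` to an HTML file and return the path."""
    # Imported inside the function so the helper can be defined
    # even before the package has been installed.
    import pandas_profiling

    report = pandas_profiling.ProfileReport(df)
    report.to_file(path)  # assumes to_file accepts the output path as its first argument
    return path

# Usage in the notebook (not executed here):
# save_profile_report(data)
```

The resulting file can be opened in any browser without a Python environment.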

Now the last and most important step is to create the bubbly plot.

figure = bubbleplot(dataset = data, x_column = 'trestbps', y_column = 'chol',
    bubble_column = 'sex', time_column = 'age', size_column = 'oldpeak', color_column = 'sex',
    x_title = "Resting Blood Pressure", y_title = "Cholesterol", title = 'BP vs Chol. vs Age vs Sex vs ST Depression',
    x_logscale = False, scale_bubble = 3, height = 650)
py.iplot(figure, config={'scrollZoom': True})

In the code above, we use the bubbleplot function from the bubbly package that we installed using pip. It takes the dataset as data, while x_column and y_column name the columns plotted along the x axis and the y axis. Each bubble represents a set of values determined by bubble_column, time_column, size_column and color_column. For example, with color_column set to ‘sex’, the two sex values are drawn in different colors, and with time_column set to ‘age’, the plot is animated over age, so each frame shows the bubbles for one age value. size_column, here ‘oldpeak’ (ST depression induced by exercise), controls how large each bubble is. The plots are shown below:
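As a rough mental model of how these column roles fit together, the sketch below groups a few rows into one frame per value of the time column and attaches a position, a size and a color group to each bubble. This is a conceptual illustration only and not how bubbly is implemented: the function `build_frames`, the rows, and the simple multiplication used for size scaling are all my own assumptions.

```python
from collections import defaultdict

# Made-up rows; only the column names mirror the ones passed to bubbleplot.
rows = [
    {"age": 41, "sex": 0, "trestbps": 130, "chol": 204, "oldpeak": 1.5},
    {"age": 41, "sex": 1, "trestbps": 120, "chol": 236, "oldpeak": 0.0},
    {"age": 57, "sex": 1, "trestbps": 140, "chol": 192, "oldpeak": 0.5},
]

def build_frames(rows, x, y, time, size, color, scale=3):
    """Group rows into one animation frame per distinct value of `time`."""
    frames = defaultdict(list)
    for row in rows:
        frames[row[time]].append({
            "x": row[x],                 # position along the x axis
            "y": row[y],                 # position along the y axis
            "size": row[size] * scale,   # illustrative scaling, not bubbly's formula
            "color_group": row[color],   # rows sharing this value share a color
        })
    # Sort by the time value so frames play in order
    return dict(sorted(frames.items()))

frames = build_frames(rows, x="trestbps", y="chol", time="age",
                      size="oldpeak", color="sex")

for age, bubbles in frames.items():
    print(age, bubbles)
```

Each key of `frames` corresponds to one step of the animation slider, which is why moving the slider over age changes which bubbles are visible.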

This is a great way to visualize data, and the library is openly available on PyPI. Anyway, that's it for today. Thanks for reading. Keep Learning.

Cheers.


Written by Charan Soneji