Machine Learning Series: regression-2 (Data Visualization)

Arun
Published in Geek Culture
2 min read · Aug 8, 2021

Previously, we saw how to use linear regression to fit a straight line that models the relationship between diabetes disease progression and predictors such as body mass index and blood pressure. The dataset we got from Scikit-learn had already been prepared for model building. In reality, datasets rarely come prepared like that. We need to prepare the dataset ourselves and use visualization techniques to understand it and transform it into something our machine learning model can use effectively. The quality of the results a model produces depends heavily on the dataset we use.

Example — Pumpkin Data Set

We are once again going to turn to an example dataset to see how data is prepared, analyzed, and visualized. This exercise is inspired by the GitHub tutorial. Get the CSV from this link.
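To give a feel for what "preparing" a dataset can mean, here is a minimal sketch in pandas. It does not use the real pumpkin CSV: the inline data, the column names (`date`, `low_price`, `high_price`) and the `prepare` helper are all made up for illustration, so adjust them to whatever the actual file contains.

```python
import io

import pandas as pd

# Hypothetical stand-in for the pumpkin CSV. The column names and every
# value below are invented for illustration only.
RAW_CSV = """date,low_price,high_price
2016-09-24,15.0,18.0
2016-09-24,,
2016-10-01,14.0,16.0
2016-10-08,18.0,20.0
2016-11-05,12.0,14.0
"""


def prepare(raw_csv: str) -> pd.DataFrame:
    """Drop incomplete rows and derive a month and an average price column."""
    df = pd.read_csv(io.StringIO(raw_csv), parse_dates=["date"])
    # Rows without prices cannot help the model, so remove them.
    df = df.dropna(subset=["low_price", "high_price"])
    prepared = pd.DataFrame(
        {
            "month": df["date"].dt.month,
            # Use the midpoint of the low and high price as a single value.
            "price": (df["low_price"] + df["high_price"]) / 2,
        }
    )
    return prepared.reset_index(drop=True)


prepared = prepare(RAW_CSV)
print(prepared)
```

The idea is simply to end up with a small, clean table (one row per observation, only the columns we care about) before any plotting or model building starts.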

In this example we used a scatter plot and a bar chart to visualize the data. There are various other visualization techniques that can make the data easier to comprehend.
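As a rough sketch of those two chart types, the snippet below draws a scatter plot and a bar chart with Matplotlib. The `pumpkins` table, its `month` and `price` columns, the values in it, and the output file name are assumptions made for illustration; they are not taken from the real pumpkin data.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical, already-prepared data: month of sale and average price.
# All values are invented for illustration.
pumpkins = pd.DataFrame(
    {"month": [9, 10, 10, 11], "price": [16.5, 15.0, 19.0, 13.0]}
)

fig, (ax_scatter, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))

# Scatter plot: one point per observation, useful for spotting spread and outliers.
ax_scatter.scatter(pumpkins["month"], pumpkins["price"])
ax_scatter.set_xlabel("Month")
ax_scatter.set_ylabel("Price")
ax_scatter.set_title("Price per observation")

# Bar chart: one bar per month showing the mean price for that month.
monthly_mean = pumpkins.groupby("month")["price"].mean()
ax_bar.bar(monthly_mean.index, monthly_mean.values)
ax_bar.set_xlabel("Month")
ax_bar.set_ylabel("Mean price")
ax_bar.set_title("Mean price per month")

fig.tight_layout()
fig.savefig("pumpkin_prices.png")  # hypothetical output file name
```

The scatter plot keeps every data point visible, while the bar chart summarizes the same data per month, which is why the two are often looked at side by side.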

“Data visualization is very important in understanding the nature of the dataset we are working with.”

Here is a useful link for data visualization!
