P05 — Week 6 — EDA and Machine Learning models

Bengü Barış Balkan
AIN311 Fall 2023 Projects
2 min read · Dec 28, 2023

This week, we focused on machine learning algorithms and their evaluation.

As we mentioned last week, our data was ready to use in machine learning models. However, we didn’t have much insight into the data yet, so this week we started with exploratory data analysis.

First, we wanted to see if there were any outliers, so we used boxplots to detect them.

Figure 1: CO2 boxplot

As seen in Figure 1, we found two outlier samples in the laboratory (left) and real-world (right) measurements. We dropped those samples from the data.
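As a minimal sketch, the boxplot check and the pruning step could look like the snippet below; the column names (co2_lab, co2_real), the file name, and the 1.5 × IQR rule are assumptions for illustration, not our exact code.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical column/file names; adjust to the actual dataset.
df = pd.read_csv("emissions.csv")
co2_cols = ["co2_lab", "co2_real"]

# Side-by-side boxplots of laboratory and real-world CO2 measurements.
df[co2_cols].plot(kind="box", subplots=True, layout=(1, 2), figsize=(8, 4))
plt.tight_layout()
plt.show()

# Flag outliers with the standard 1.5 * IQR rule used by boxplot whiskers.
def iqr_outliers(series: pd.Series) -> pd.Series:
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (series < q1 - 1.5 * iqr) | (series > q3 + 1.5 * iqr)

mask = iqr_outliers(df["co2_lab"]) | iqr_outliers(df["co2_real"])
df = df[~mask].reset_index(drop=True)  # drop the flagged samples
```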

Figure 2: Pruned CO2 distributions

After pruning the outliers, we moved on to scaling our data. At first we used a Min-Max scaler, but since most of our features span a wide range (Fig. 3), we switched to a Standard scaler.

Figure 3: Data Distributions
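The switch could be sketched with scikit-learn as below, continuing from the snippet above; the target column name is an assumption, and in a stricter setup the scaler would be fit on the training split only.

```python
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Assumed target column; everything else is treated as a (numeric) feature.
feature_cols = [c for c in df.columns if c != "co2_real"]

# First attempt: Min-Max scaling to [0, 1].
# X_scaled = MinMaxScaler().fit_transform(df[feature_cols])

# Standardization (zero mean, unit variance) handled the wide feature
# ranges better, so we switched to StandardScaler.
X_scaled = StandardScaler().fit_transform(df[feature_cols])
```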

Then, we checked for highly correlated features (Fig. 4), since they could affect the models’ performances.

Figure 4: Correlation matrix

In our data, the steering axle width and the other axle width features had an almost perfect correlation. Keeping both would just feed the models near-duplicate information, so we dropped the other axle width feature to avoid any unwanted bias.
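The correlation check can be sketched like this; the 0.95 threshold and the exact column names are assumptions, not values taken from our report.

```python
import numpy as np

# Absolute correlation matrix of the features (Fig. 4).
corr = df[feature_cols].corr().abs()

# Keep only the upper triangle so each feature pair is considered once.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

# Columns that are almost perfectly correlated with another feature.
to_drop = [col for col in upper.columns if (upper[col] > 0.95).any()]
print(to_drop)  # e.g. the second axle-width column

df = df.drop(columns=to_drop)
```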

Applying Models

To test our data, we used several models (a minimal setup sketch follows the list). To be exact, we used:
1- Linear regression
2- KNN regressor
3- Multivariate regression
4- Ridge regression
5- Lasso regression
6- Support vector regression
7- Decision tree regressor
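A minimal setup for these models with scikit-learn could look like the sketch below; we fold plain linear and multivariate regression into a single LinearRegression on all features, and all hyperparameters are library defaults rather than our tuned values.

```python
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Standardized features and assumed target column, continuing from the
# steps above (all remaining columns are assumed to be numeric).
X = StandardScaler().fit_transform(df.drop(columns=["co2_real"]))
y = df["co2_real"].values

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "Linear / multivariate regression": LinearRegression(),
    "KNN regressor": KNeighborsRegressor(n_neighbors=5),
    "Ridge regression": Ridge(alpha=1.0),
    "Lasso regression": Lasso(alpha=0.1),
    "Support vector regression": SVR(),
    "Decision tree regressor": DecisionTreeRegressor(random_state=42),
}

for model in models.values():
    model.fit(X_train, y_train)
```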

Results

Among all the models, the KNN regressor gave the best result, with a root mean squared error (RMSE) of 34.12.

Figure 5: Models’ RMSE scores
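Continuing from the sketch above, the RMSE comparison in Figure 5 could be produced roughly like this, evaluating each fitted model on the held-out test split.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# RMSE = sqrt(mean((y_true - y_pred)^2)) on the test split.
scores = {}
for name, model in models.items():
    y_pred = model.predict(X_test)
    scores[name] = np.sqrt(mean_squared_error(y_test, y_pred))

# Print models from best (lowest RMSE) to worst.
for name, rmse in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name}: RMSE = {rmse:.2f}")
```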

Next week, we’ll be focusing on polishing our models and our project’s final report.
