How Good is Your Model? — Intro To Machine Learning #4

Published in Simple AI, Feb 6, 2017 (4 min read)

Hi, this is the fourth article in our journey through machine learning algorithms. You can find Part 1, Part 2, and Part 3 here (depending on your background, they might not be required for reading this article).

In this post I’m going to show you two simple ways you can use to evaluate your machine learning model.

Agenda:

Why model evaluation?
Problems with training and testing on the same data
Train/test split method
Cross-validation method
Comparing cross-validation to train/test split

Why Model Evaluation

Whenever you have a problem and you want to solve it using machine learning, one thing you will have to ask yourself after choosing your machine learning model is: how good or bad is your model? The reason for asking is that you need to know how the model will perform when it is used to make predictions on unseen data.

Problems with Training and Testing on the same Data

The goal is to estimate the likely performance of a model on out-of-sample data. But maximizing training accuracy rewards overly complex models that won’t necessarily generalize. Unnecessarily complex models overfit (also called over-learning) the training data: overfitting happens when the model learns details that are specific to the noise in the training data rather than the underlying pattern.

Let’s take the image above as an example. The lines represent decision boundaries that separate the positive examples (red) from the negative examples (blue).

The green line is likely to perform poorly on out-of-sample (unseen) data because it bends to fit the noisy data points, while the black line can be considered the better boundary: it doesn't follow the noisy points, so it generalizes better beyond the training data. Overfitting is a big deal in machine learning and it can be a bit difficult to understand at first, but you will get it.
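To make the problem concrete, here is a minimal sketch of what happens when a very flexible model is scored on the same data it was trained on. The synthetic dataset, the amount of label noise (`flip_y=0.2`), the 700/300 cut, and the choice of an unconstrained decision tree are all assumptions made for illustration; they are not taken from the image above.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data. flip_y=0.2 randomly reassigns the
# labels of about 20% of the samples to simulate noise (illustrative choice).
X, y = make_classification(n_samples=1000, n_features=10, flip_y=0.2,
                           random_state=0)

# make_classification shuffles the rows, so the last 300 rows can stand in
# for data the model has never seen.
X_seen, y_seen = X[:700], y[:700]
X_unseen, y_unseen = X[700:], y[700:]

# With no depth limit, the tree keeps splitting until it fits every
# training point, including the noisy ones.
model = DecisionTreeClassifier(random_state=0)
model.fit(X_seen, y_seen)

train_accuracy = model.score(X_seen, y_seen)
unseen_accuracy = model.score(X_unseen, y_unseen)

print(f"accuracy on the data it was trained on: {train_accuracy:.3f}")
print(f"accuracy on data it has never seen:     {unseen_accuracy:.3f}")
```

The first number looks perfect only because the model has memorized its training points; the second number is the one that tells you how the model is likely to behave in practice.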

Evaluation Methods

Evaluation procedure (1): Train/test split

Split the dataset into two pieces: a training set and a testing set.
Train the model on the training set.
Test the model on the testing set, and evaluate how well we did.

sklearn train/test procedure
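The three steps above can be sketched with scikit-learn roughly as follows. The iris dataset, the K-nearest-neighbors classifier, the 25% test size, and the `random_state` value are assumptions chosen for illustration; any dataset and estimator would follow the same pattern.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Illustrative dataset: 150 flowers, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Step 1: split the dataset into a training set and a testing set.
# random_state fixes the shuffle so the split is reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=4
)

# Step 2: train the model on the training set only.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

# Step 3: predict on the testing set and evaluate how well we did.
y_pred = knn.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)

print(f"testing accuracy: {test_accuracy:.3f}")
```

Changing `random_state` changes which observations end up in the testing set, so the reported accuracy can move from one run to the next.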

Evaluation procedure (2): K-fold cross-validation

1. Split the dataset into K equal partitions (or “folds”).
2. Use fold 1 as the testing set and the union of the other folds as the training set.
3. Calculate testing accuracy.
4. Repeat steps 2 and 3 K times, using a different fold as the testing set each time.
5. Use the average testing accuracy as the estimate of out-of-sample accuracy.

sklearn KFold
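Here is a minimal sketch of the K-fold steps above, written as an explicit loop so each step is visible. The dataset, the classifier, and K = 5 are illustrative assumptions. Shuffling is switched on because the iris rows are stored sorted by class, so unshuffled folds would contain very uneven class mixes.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Split the dataset into K = 5 folds of equal size (30 rows each here).
kf = KFold(n_splits=5, shuffle=True, random_state=0)

fold_accuracies = []
for train_idx, test_idx in kf.split(X):
    # Use one fold as the testing set and the other folds as the training set.
    knn = KNeighborsClassifier(n_neighbors=5)
    knn.fit(X[train_idx], y[train_idx])

    # Calculate testing accuracy for this fold.
    fold_accuracies.append(knn.score(X[test_idx], y[test_idx]))

# Use the average testing accuracy as the out-of-sample estimate.
cv_accuracy = np.mean(fold_accuracies)

print("accuracy per fold:", np.round(fold_accuracies, 3))
print(f"average accuracy:  {cv_accuracy:.3f}")
```

In everyday use, scikit-learn's `cross_val_score` helper wraps this loop in a single call; the explicit version is shown only to mirror the steps one by one.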

Comparing Cross-validation To Train/Test Split

Advantages of cross-validation:

More reliable estimate of out-of-sample accuracy (the result depends less on which observations happen to land in the testing set)
More “efficient” use of data (every observation is used for both training and testing)

Advantages of train/test split:

Runs roughly K times faster than K-fold cross-validation (the model is trained once instead of K times)
Simpler to examine the detailed results of the testing process
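A rough way to see this trade-off is to repeat a train/test split with different random seeds and compare the spread of the results with a single cross-validated average. The dataset, classifier, ten seeds, and `cv=10` below are assumptions picked for illustration, not a benchmark.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Train/test split: one fit per run, but the score depends on the split.
split_accuracies = []
for seed in range(10):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
    split_accuracies.append(knn.score(X_test, y_test))

# 10-fold cross-validation: ten fits, one averaged score.
# For a classifier, an integer cv uses stratified folds by default.
cv_scores = cross_val_score(
    KNeighborsClassifier(n_neighbors=5), X, y, cv=10, scoring="accuracy"
)

print(f"single splits: min {min(split_accuracies):.3f}, "
      f"max {max(split_accuracies):.3f}")
print(f"10-fold cross-validation mean: {cv_scores.mean():.3f}")
```

The gap between the minimum and maximum of the single splits gives a feel for how much one train/test split can mislead you, while the cross-validated mean costs ten model fits to compute.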

Which One Is the Best?

I guess you know the answer by now: it depends on your dataset and the computational resources available to you. On large datasets, cross-validation is computationally expensive to run (trust me, the wait can be boring).