Model Validation

Aleksandr · Published in unpack · Mar 8, 2021

Model validation is the process of evaluating a trained model on a test data set. This provides an estimate of the model’s generalization ability. Using proper validation techniques helps you understand your model and its generalization performance.

Let’s talk about some of the most popular model validation techniques.

3-way holdout method of getting training, validation and test data sets

If all the data is used for training the model and the error rate is then evaluated by comparing predictions against actual values on that same training data, the resulting error is called the resubstitution error. To avoid it, the data is split into two separate sets: a training dataset and a testing dataset. This can be a 60/40, 70/30, or 80/20 split.
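
A minimal sketch of this idea, assuming scikit-learn and a toy dataset standing in for real data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy data standing in for a real dataset.
X, y = make_classification(n_samples=1000, random_state=42)

# An 80/20 split into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
# Resubstitution (training) accuracy is optimistic,
# while test accuracy estimates generalization.
print("train accuracy:", model.score(X_train, y_train))
print("test accuracy:", model.score(X_test, y_test))
```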

When optimizing the hyperparameters of your model, you might overfit it if you tune against the train/test split itself. That’s because the search settles on the hyperparameters that fit that specific split. To solve this issue, you can create an additional holdout set, often around 10% of the data, which you have not used in any of your processing/validation steps.
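
One way to build such a three-way split is with two consecutive `train_test_split` calls; the 60/20/20 ratios below are just one common choice:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=42)

# First carve off 20% as the final test (holdout) set...
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
# ...then split the remainder 75/25, giving 60/20/20 overall.
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=42
)
# Tune hyperparameters against (X_val, y_val); touch
# (X_test, y_test) only once, for the final evaluation.
```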

k-fold cross-validation with an independent test data set

Instead of making a single split, we can also make many splits and validate on all of them. This is the k-fold cross-validation technique: it splits the data into k folds, trains the model on k-1 folds, and tests on the one fold that was left out. It repeats this for each of the k folds and averages the results.

The advantage is that the entire dataset is used for both training and testing. The error rate of the model is the average of the error rates of the individual iterations. We typically choose either k=5 or k=10, as these strike a nice balance between computational complexity and validation accuracy.
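
A minimal sketch with scikit-learn’s `cross_val_score` (k = 5, same kind of toy data as above):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, random_state=42)

# Each of the 5 folds serves once as the test fold
# while the other 4 are used for training.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print("per-fold accuracy:", scores)
print("mean accuracy:", scores.mean())
```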

This technique can also be seen as a form of the repeated holdout method. The reliability of the error estimate can be improved with stratification, i.e., making each fold preserve the class distribution of the full dataset.
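
Stratification can be requested explicitly via `StratifiedKFold`; a sketch with a deliberately imbalanced toy dataset (note that for classifiers scikit-learn already stratifies by default when `cv` is an integer):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Imbalanced toy data: roughly 90% / 10% classes.
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=42)

# Each fold keeps a class ratio close to the full dataset's.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)
print("mean accuracy:", scores.mean())
```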

Leave-one-out cross-validation with an independent test data set

A variant of k-fold CV is leave-one-out cross-validation (LOOCV). It uses each sample in the data as a separate test set while all remaining samples form the training set. This variant is identical to k-fold CV when k = n (the number of observations).

It is computationally very costly, as the model needs to be trained n times. Only use it if the dataset is small or if you can afford that many training runs.
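
In scikit-learn, LOOCV is just another splitter; a sketch on a small dataset where the cost is still acceptable:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Iris has only 150 samples, so this means n = 150 model fits.
X, y = load_iris(return_X_y=True)

scores = cross_val_score(
    LogisticRegression(max_iter=1000), X, y, cv=LeaveOneOut()
)
# Each fold has a single test sample, so each score is 0 or 1;
# their mean is the LOOCV accuracy estimate.
print("LOOCV accuracy:", scores.mean())
```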

More

There are other techniques that we won’t discuss here, but they are worth mentioning:

  • Nested Cross-Validation
  • Time Series Cross-Validation
  • Random Subsampling
  • Bootstrapping, and more

Which model to pick

There is an interesting, perhaps a bit simplified, rule of thumb about which of the three most-used techniques fits which dataset: the simple holdout method for large datasets, k-fold cross-validation for mid-sized ones, and leave-one-out cross-validation for very small ones.

There are also many methods that apply statistical tests to the selection of machine learning models by comparing the accuracies of different models, such as the Wilcoxon signed-rank test, McNemar’s test, the 5x2cv paired t-test, and so on.
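
As one hedged sketch of that idea, the per-fold scores of two models can be compared with SciPy’s Wilcoxon signed-rank test (fold scores are not fully independent, so such p-values should be read with caution):

```python
from scipy.stats import wilcoxon
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, random_state=42)

# Per-fold accuracies of two candidate models on the same folds.
scores_a = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=10)
scores_b = cross_val_score(RandomForestClassifier(random_state=42), X, y, cv=10)

# Paired, non-parametric test on the per-fold score differences.
stat, p = wilcoxon(scores_a, scores_b)
print("p-value:", p)
```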

Conclusion

This is a short introduction to a series of model validation techniques and to the ways we decide which ones are better for our particular dataset. Making that choice requires a good understanding of the dataset at hand, but in the end, working with and knowing your data is one of the most important things in deep learning anyway.
