Regularization — A Technique Used to Prevent Over-fitting

Kiprono Elijah Koech
8 min read · Dec 17, 2020
Source: https://unsplash.com/photos/3SYi9YfTXdU

The very essence of any machine learning project is to end up with a model that performs well on unseen data (test data). In some cases, the model attains high accuracy on the training set but yields poor predictive performance on the test set: a case of over-fitting.

In over-fitting, the model describes random error (noise) instead of the underlying relationship.
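To make this concrete, here is a minimal sketch of over-fitting. The scikit-learn setup and the synthetic sine-shaped dataset are assumptions chosen purely for illustration: a high-degree polynomial fitted to a small, noisy sample scores nearly perfectly on the training data but poorly on held-out test data.

```python
# Minimal sketch (assumed setup): a 15th-degree polynomial memorises noise
# in a small training set, so the training score is high while the test
# score collapses.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.uniform(0, 1, size=(30, 1))
y = np.sin(2 * np.pi * X).ravel() + rng.normal(scale=0.3, size=30)  # signal + noise

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

model = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
model.fit(X_train, y_train)

print("train R^2:", model.score(X_train, y_train))  # close to 1.0
print("test  R^2:", model.score(X_test, y_test))    # much lower, often negative
```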

Figure 1: Source: Author

Bias-Variance Trade-off

Bias: Bias is the error introduced by over-simplifying the learning algorithm, so that it does not fit the data properly. High bias can lead to under-fitting.

Variance: Variance is the error introduced by using an algorithm that is too complex; the model performs very well on the training set but poorly on the test set. High variance leads to high sensitivity to the training data and to over-fitting.

Normally, as you increase the complexity of your model, you will see a reduction in error due to lower bias. However, this only holds up to a certain point. As you continue to make your model more complex, you end up over-fitting it, and your model starts suffering from high variance.
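The sketch below illustrates this trade-off, reusing the same assumed scikit-learn setup and synthetic data as above. It sweeps the polynomial degree as a stand-in for model complexity: at low degrees both errors are high (high bias, under-fitting), while at high degrees the training error keeps shrinking but the test error rises again (high variance, over-fitting).

```python
# Rough sketch of the bias-variance trade-off (assumed setup): sweep model
# complexity (polynomial degree) and watch training error keep falling while
# test error first falls (bias shrinking) and then rises (variance growing).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(1)
X = rng.uniform(0, 1, size=(60, 1))
y = np.sin(2 * np.pi * X).ravel() + rng.normal(scale=0.3, size=60)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=1)

for degree in [1, 3, 5, 9, 15]:
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```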
