The problem of overfitting and solution

Muneeb Ali
2 min readMar 9, 2022

--

•Too closely or exactly to a particular set of data,

  • This, therefore, fail to fit additional data or predict future observations reliably.

Cross-Validation

•Involves dividing data into a training set and a test set.

•Fit the model parameters on the training set and evaluate performance on the test set.

•k-fold cross-validation:

•Data is first partitioned into k equally (or nearly equally) sized segments or folds.

  • Subsequently, k iterations of training and validation are performed such that within each iteration a different fold of the data is held-out for validation while the remaining k — 1 folds are used for learning.

Early stopping

•Stop Training When Generalization Error Increases

•There are three elements to using early stopping; they are:

•Monitoring model performance.

•Trigger to stop training.

  • The choice of model to use

Bias-Variance trade-off

  • Bias is the expectation in error and variance is the variability in the model.
  • Finding the proper balance between two

--

--