The problem of overfitting and solutions
2 min read · Mar 9, 2022
•Overfitting occurs when a model fits a particular set of data too closely or exactly.
- The model therefore fails to fit additional data or predict future observations reliably.
Cross-Validation
•Involves dividing data into a training set and a test set.
•Fit the model parameters on the training set and evaluate performance on the test set.
•k-fold cross-validation:
•Data is first partitioned into k equally (or nearly equally) sized segments or folds.
- Subsequently, k iterations of training and validation are performed such that, within each iteration, a different fold of the data is held out for validation while the remaining k - 1 folds are used for learning.
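The k-fold procedure above can be sketched in pure Python. The fold-splitting logic follows the description directly; the toy "model" (predicting the training mean, scored by mean squared error) is only an assumption for illustration, not a prescribed method:

```python
import random

def k_fold_splits(n_samples, k, seed=0):
    """Partition sample indices into k equally (or nearly equally) sized folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(xs, ys, k, fit, score):
    """Run k iterations: each fold is held out once for validation
    while the remaining k - 1 folds are used for training."""
    folds = k_fold_splits(len(xs), k)
    scores = []
    for i in range(k):
        held_out = set(folds[i])
        train = [(xs[j], ys[j]) for j in range(len(xs)) if j not in held_out]
        val = [(xs[j], ys[j]) for j in folds[i]]
        model = fit(train)
        scores.append(score(model, val))
    return sum(scores) / k  # average validation score over the k folds

# Toy example (hypothetical): the "model" is just the mean of the
# training targets, scored by mean squared error on the held-out fold.
xs = list(range(10))
ys = [2.0 * x for x in xs]
fit = lambda train: sum(y for _, y in train) / len(train)
score = lambda m, val: sum((y - m) ** 2 for _, y in val) / len(val)
avg_mse = cross_validate(xs, ys, k=5, fit=fit, score=score)
```

In practice a library routine such as scikit-learn's `cross_val_score` does this splitting and averaging for you; the point here is only the hold-one-fold-out loop.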
Early stopping
•Stop training when the generalization (validation) error starts to increase.
•There are three elements to using early stopping:
•Monitoring model performance on a validation set.
•A trigger to stop training.
- The choice of which model to use (typically the best weights seen so far).
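The three elements above can be sketched as a simple training loop. This is a minimal illustration, assuming a "patience" trigger (stop after the validation loss has failed to improve for a few consecutive epochs) and a synthetic validation curve; real frameworks offer this as a built-in callback:

```python
def train_with_early_stopping(n_epochs, val_loss_fn, patience=3):
    """Monitor validation loss each epoch (element 1); stop when it has
    not improved for `patience` epochs (element 2); remember the best
    epoch so its weights can be restored (element 3)."""
    best_loss = float("inf")
    best_epoch = -1
    waited = 0
    for epoch in range(n_epochs):
        loss = val_loss_fn(epoch)          # monitor performance
        if loss < best_loss:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:         # trigger: no improvement
                break
    return best_epoch, best_loss           # choice of model: best seen

# Hypothetical validation curve: improves until epoch 4, then worsens
# as the model begins to overfit.
curve = [1.0, 0.8, 0.6, 0.5, 0.45, 0.5, 0.55, 0.6, 0.7, 0.8]
best_epoch, best_loss = train_with_early_stopping(len(curve), lambda e: curve[e])
```

With this curve, training halts three epochs after the minimum at epoch 4, and the epoch-4 model is the one kept.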
Bias-Variance trade-off
- Bias is the error introduced by a model's simplifying assumptions (the gap between its expected prediction and the true value); variance is the variability of its predictions across different training sets.
- Generalization requires finding the proper balance between the two: too much bias underfits, too much variance overfits.
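The trade-off can be made concrete with a small simulation. The setup below is an assumption for illustration: a true function f(x) = x², many independently drawn noisy training sets, and two hypothetical models — a constant predictor (high bias, low variance) and a 1-nearest-neighbour predictor (low bias, high variance). We estimate each model's bias² and variance at a fixed test point:

```python
import random
import statistics

def f(x):
    return x ** 2  # assumed true function

def bias_variance(model, n_train=20, n_datasets=500, x0=0.5, noise=0.3, seed=0):
    """Empirical bias^2 and variance of a model's prediction at x0,
    estimated over many independently drawn training sets."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_datasets):
        xs = [rng.random() for _ in range(n_train)]
        ys = [f(x) + rng.gauss(0, noise) for x in xs]
        preds.append(model(xs, ys, x0))
    mean_pred = statistics.mean(preds)
    bias_sq = (mean_pred - f(x0)) ** 2     # systematic error of the average prediction
    variance = statistics.variance(preds)  # spread of predictions across training sets
    return bias_sq, variance

# High-bias, low-variance: always predict the training-set mean.
constant = lambda xs, ys, x0: statistics.mean(ys)
# Low-bias, high-variance: copy the target of the nearest training point.
nearest = lambda xs, ys, x0: ys[min(range(len(xs)), key=lambda i: abs(xs[i] - x0))]

b_const, v_const = bias_variance(constant)
b_nn, v_nn = bias_variance(nearest)
```

The constant model shows the larger bias² and the smaller variance; the nearest-neighbour model shows the reverse — the two ends of the trade-off the bullets describe.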