The Idea of Overfitting in Neural Networks
Consider three students preparing for an exam and then taking the test.
- The first (let's call him Peter): he didn't prepare well for the exam and performed badly in the test.
- The second (let's call her Julia): she memorized everything word for word, yet she still didn't perform that well (though better than Peter).
- The third (let's call him Van): he conceptualized what he learned and related it to real-life observations in a broader sense, so he didn't just memorize the words but actually understood them.
Now, why did Van score an "A" while the other two performed badly, one getting an "F" and the other a "C"?
There can be various factors, but the prominent one here is that Van didn't just memorize words or paragraphs for the test; he made sure he grasped the broader idea by conceptualizing what he learned and linking it to his observations in real life.
That made Van think more critically and logically, because he had generalized the ideas he learned. So whenever he sees a new kind of question he has never seen before, he will be able to attempt it with the same fundamentals, with some tweaks and tricks.
The same idea extends to another example: a person who wants to get his driving license goes to a driving school, and we consider three different trainers.
Case 1: The trainer was not good at his job and didn't impart his knowledge (even the basics) of how to drive a car.
Case 2: The trainer made the person drive along with him every day but never left him on his own.
Case 3: The trainer made the person drive along with him for the first few days, and once he felt the person was learning, he started leaving him alone for drives, first on alternate days and then regularly, so that in the end the person had good confidence behind the wheel.
From these two examples, we learn that if someone is trying to learn something, rote learning will never work; one has to generalize the concepts learned, and only then can one perform well in the real world.
Let's now apply this to neural networks. When a neural network learns from the given features (inputs) through the backpropagation algorithm, it sometimes happens that the network memorizes the training examples and becomes too specific to them.
This means it works well on the data in the training set but performs badly when the same model is applied to the testing data. This happens because the network hasn't learned the general features and fails to recognize the fundamental entities that make up the final picture (the output), so it doesn't generalize well.
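The symptom is easy to measure: accuracy on the training set is much higher than accuracy on held-out test data. Here is a minimal sketch of that gap using scikit-learn (the dataset, network size, and other parameters are my own illustrative choices, not from this article): an over-sized network trained on a small, noisy dataset scores near-perfectly on its training half but noticeably worse on the testing half.

```python
# Sketch: detect overfitting by comparing training vs. testing accuracy.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Small, noisy dataset split into a training half and a testing half.
X, y = make_moons(n_samples=200, noise=0.30, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

# Deliberately large network relative to the amount of data, so it can memorize.
net = MLPClassifier(hidden_layer_sizes=(256, 256), max_iter=5000, random_state=0)
net.fit(X_train, y_train)

print("training accuracy:", net.score(X_train, y_train))  # close to 1.0
print("testing accuracy :", net.score(X_test, y_test))    # noticeably lower
```

A large gap between the two printed numbers is the numerical signature of the "memorized the training set" behaviour described above.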
Let's understand this with the following example of a classifier:
The classifier makes many mistakes in the first picture because it hasn't learned well, which is called underfitting. In the second picture it performs pretty well, making only negligible mistakes. In the third picture it makes zero mistakes, but it is being too specific, which means it will generally perform badly on testing data; this is called overfitting.
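You can reproduce the same three behaviours in code. In this rough sketch (the depths and dataset are arbitrary choices for illustration, not from the article), the same data is fitted with three decision trees of increasing capacity, mirroring the three pictures: too simple, about right, and too specific.

```python
# Sketch: underfitting, a good fit, and overfitting as model capacity grows.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=300, noise=0.25, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=1)

for depth, label in [(1, "underfit"), (4, "good fit"), (None, "overfit")]:
    tree = DecisionTreeClassifier(max_depth=depth, random_state=1).fit(X_train, y_train)
    print(f"{label:9s} train accuracy={tree.score(X_train, y_train):.2f} "
          f"test accuracy={tree.score(X_test, y_test):.2f}")
```

The unlimited-depth tree gets every training point right, just like the third picture, yet its test accuracy is typically no better (and often worse) than the moderate-depth tree.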
See the following model complexity graph, which shows the training error on the training dataset and the testing error on the testing dataset as the number of epochs increases.
As we can see, at the start (fewer epochs) the error rate for both training and testing decreases as the network learns better and better. But after a certain point (sometimes called the goldilocks point) the network becomes too specific to the training dataset: it keeps doing better on the training data, so the training error stays low, while it performs badly in general on the testing data, and the testing error starts increasing.
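A common practical response to this graph is to stop training near that point. Below is a minimal sketch of the idea, assuming scikit-learn; the network size, patience value, and dataset are hypothetical choices for illustration. It trains one epoch at a time, records the training and validation error (the two curves in the graph), and stops once the validation error has not improved for a few epochs.

```python
# Sketch: watch training vs. validation error per epoch and stop near the
# point where validation error stops improving (the "goldilocks point").
import warnings
from sklearn.datasets import make_moons
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

warnings.filterwarnings("ignore", category=ConvergenceWarning)  # one-epoch fits warn; ignore for this sketch

X, y = make_moons(n_samples=400, noise=0.30, random_state=2)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=2)

# warm_start=True + max_iter=1 lets each fit() call run one more epoch on the same weights.
net = MLPClassifier(hidden_layer_sizes=(128, 128), max_iter=1, warm_start=True, random_state=2)

best_val_error, patience, bad_epochs = 1.0, 10, 0
for epoch in range(1, 501):
    net.fit(X_train, y_train)
    train_error = 1 - net.score(X_train, y_train)
    val_error = 1 - net.score(X_val, y_val)
    if val_error < best_val_error:
        best_val_error, bad_epochs = val_error, 0
    else:
        bad_epochs += 1
    if bad_epochs >= patience:  # validation error has stopped improving
        print(f"stopping at epoch {epoch}: "
              f"train error {train_error:.3f}, validation error {val_error:.3f}")
        break
```

Letting the loop run much longer would typically keep pushing the training error down while the validation error creeps back up, which is exactly the divergence the model complexity graph illustrates.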
That wraps up the concept of overfitting in neural networks.
— — — — — — — — — — — — — — — — — — — — — — — -
Images: copyright reserved with their respective owners.
For more such awesome stories, you can subscribe or follow me.
Feel free to share your insight on overfitting of neural networks in the comments section below.
— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —