What is overfitting?

Avinash
Udacity PyTorch Challengers
2 min read · Jan 5, 2019


If you have just started learning Machine Learning, you will often hear the term overfitting. But what exactly is it?

Image from Lesson 2, Overfitting and Underfitting from Udacity’s Intro to Deep Learning with PyTorch

Before diving into overfitting, let’s first review what Machine Learning really is. In ML, we define a mathematical model that goes through a bunch of our data, say hundreds of pictures of flowers, and comes up with a way to predict them. Now, whenever we show it a picture of a flower, it will guess which category the flower belongs to.

But what if it sees an entirely new flower image, one it has never seen before? Would it predict that correctly too?

A good ML model can make accurate predictions on new data. This is called generalisation. The entire goal is for the model to see a bunch of images and generalise so well that it can predict new ones.

Now, when you are training your model with training data, you might end up training it too well. But what does that mean? An ML model recognises things by picking up features from an image. In the case of flowers, it might look at curves, lines and colour patterns to recognise a flower. When you overtrain it, the model gets used to those exact patterns and cannot guess anything outside of them. Imagine you are playing with a kid and teaching the kid about circles. Small ones, big ones, red ones and so on. But if you then show a square, the kid won’t figure it out, because the kid has been trained only on circles.

Overfitting is exactly the same. When our model is trained too much, it cannot generalise well. It cannot make good predictions outside of the training data.
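To see this in numbers, here is a minimal PyTorch sketch. It is a toy setup of my own choosing, not from the lesson: the dataset, network size and hyperparameters are all illustrative. We fit an over-sized network to just 20 noisy points and train it far too long, printing the loss on the training points and on held-out validation points as we go.

```python
# A toy overfitting demo (assumes PyTorch is installed).
# The training loss keeps shrinking, while the loss on points
# the model never trained on stops improving and creeps back up.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny dataset: y = sin(x) plus noise, split into train and validation
x = torch.linspace(-3, 3, 40).unsqueeze(1)
y = torch.sin(x) + 0.3 * torch.randn_like(x)
x_train, y_train = x[::2], y[::2]   # 20 points the model trains on
x_val, y_val = x[1::2], y[1::2]     # 20 points it never sees in training

# An over-sized model for such a small dataset
model = nn.Sequential(
    nn.Linear(1, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 1),
)
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(5001):
    optimiser.zero_grad()
    loss = loss_fn(model(x_train), y_train)
    loss.backward()
    optimiser.step()
    if epoch % 1000 == 0:
        with torch.no_grad():
            val_loss = loss_fn(model(x_val), y_val)
        print(f"epoch {epoch:5d}  train {loss.item():.4f}  val {val_loss.item():.4f}")
```

If you plot the model’s predictions after training, you will see a wiggly curve that threads through every noisy training point instead of following the smooth sine shape underneath. That is the square-and-circle story from above, in code.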

The following video from Udacity’s Intro to Deep Learning with PyTorch does a great job of explaining what underfitting and overfitting are:
