Early Stopping to Avoid Overfitting in Neural Networks - Keras

Upendra Vijay · Published in Zero Equals False · Sep 7, 2019

A problem with training neural networks is choosing the number of training epochs to use. Too many epochs can lead to overfitting of the training dataset, whereas too few may result in an underfit model.

Early stopping is a method that allows you to specify an arbitrarily large number of training epochs and stop training once the model performance stops improving on the validation dataset.

This requires that a validation split be provided to the fit() function, along with an EarlyStopping callback that specifies the performance measure to monitor on that validation split.

from keras.callbacks import EarlyStopping

model.fit(train_X, train_y, validation_split=0.3, callbacks=[EarlyStopping(monitor='val_loss')])

That is all that is needed for the simplest form of early stopping. Training will stop when the chosen performance measure stops improving. To discover the training epoch on which training was stopped, the “verbose” argument can be set to 1. Once stopped, the callback will print the epoch number.

EarlyStopping(monitor='val_loss', verbose=1)

Often, the first sign of no improvement may not be the best time to stop training. This is because the model may get slightly worse before getting much better. We can account for this by adding a delay to the trigger in terms of the number of epochs on which we would like to see no improvement. This can be done by setting the “patience” argument.

EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=50)

The exact amount of patience will vary between models and problems. A common rule of thumb is to set it to roughly 10% of the total number of epochs.

But when we introduce patience, we run into a problem.

Suppose we include EarlyStopping(monitor='val_loss', patience=2) so that the validation loss is monitored at each epoch and training is interrupted once it has not improved for two epochs. However, because patience=2, training runs two extra epochs past the best one, so the final weights are not those of the best model but of the model two epochs after it.

So an additional callback is required that will save the best model observed during training for later use. This is the ModelCheckpoint callback.

We can set the callback functions to early stop training and save the best model as follows:
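Here is a minimal sketch of the two callbacks together (the filepath 'best_model.h5', the patience value, and the epoch count are illustrative, and model, train_X, and train_y are assumed from the earlier fit() call):

from keras.callbacks import EarlyStopping, ModelCheckpoint

# Stop training once the validation loss has not improved for 50 epochs.
es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=50)

# Save the full model each time a new lowest validation loss is seen,
# so the best model survives the extra "patience" epochs.
mc = ModelCheckpoint('best_model.h5', monitor='val_loss', mode='min', verbose=1, save_best_only=True)

model.fit(train_X, train_y, validation_split=0.3, epochs=4000, callbacks=[es, mc])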

The saved model can then be loaded and evaluated any time by calling the load_model() function.

from keras.models import load_model

# Load the best model saved by the ModelCheckpoint callback.
saved_model = load_model('best_model.h5')

# evaluate() returns [loss, accuracy] when the model was compiled with metrics=['accuracy'].
_, train_acc = saved_model.evaluate(train_X, train_y, verbose=0)
_, test_acc = saved_model.evaluate(test_X, test_y, verbose=0)
print('Train: %.3f, Test: %.3f' % (train_acc, test_acc))

https://medium.com/@upendravijay2/how-does-dropout-help-to-avoid-overfitting-in-neural-networks-91b90fd86b20
