Maximizing Convolutional Neural Network Accuracy: A Summary

Jay Jirayut Chatphet
AltoTech
3 min read · Feb 1, 2020

We explore several questions that tend to come up during model training:

  • Does reducing input image size have a significant effect on prediction results?
  • What is the appropriate “learning rate” to supply during model training?

How Hyperparameters Affect Accuracy

We modify various parameters of a deep learning pipeline one at a time and observe the effect of each, primarily on validation accuracy. Where relevant, we also observe the effect on training speed and on the time taken to reach the best accuracy (i.e., convergence).

Our experimentation setup is as follows:

  • To reduce experimentation time, we use a faster architecture, MobileNet.
  • We reduce the input image resolution to 128 x 128 pixels to further speed up training. In general, we would recommend a higher resolution (at least 224 x 224) for production systems.
  • The learning rate is set to 0.001 with the Adam optimizer, as shown in the sketch below.
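
As a rough sketch, this setup might look like the following in tf.keras. The 102-class head assumes the Oxford Flowers 102 dataset discussed later in this article; the build_model helper and the other names are ours, for illustration only.

    import tensorflow as tf

    def build_model(optimizer, img_size=128, num_classes=102):
        """Build the MobileNet classifier used throughout the experiments."""
        base = tf.keras.applications.MobileNet(
            input_shape=(img_size, img_size, 3),
            include_top=False,  # drop the ImageNet classification head
            weights="imagenet",
            pooling="avg",
        )
        model = tf.keras.Sequential([
            base,
            tf.keras.layers.Dense(num_classes, activation="softmax"),
        ])
        model.compile(
            optimizer=optimizer,
            loss="categorical_crossentropy",
            metrics=["accuracy"],
        )
        return model

    # Baseline configuration: 128 x 128 inputs, Adam at learning rate 0.001.
    model = build_model(tf.keras.optimizers.Adam(learning_rate=0.001))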

Effect of Learning Rate

Experimental setup: Vary the learning rate among 0.1, 0.01, 0.001, and 0.0001

Effect of learning rate on model accuracy and speed of convergence

Here are the key takeaways:

  • Too high a learning rate, and the model may never converge.
  • Too low a learning rate, and the model takes a long time to converge.
  • Striking the right balance is crucial for training quickly.
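
As a rough sketch, the sweep described above can be run by rebuilding the model once per learning rate. This reuses the hypothetical build_model helper from the setup sketch, and assumes train_ds and val_ds are prepared training and validation pipelines.

    import tensorflow as tf

    for lr in [0.1, 0.01, 0.001, 0.0001]:
        model = build_model(tf.keras.optimizers.Adam(learning_rate=lr))
        history = model.fit(train_ds, validation_data=val_ds, epochs=10)
        best = max(history.history["val_accuracy"])
        print(f"lr={lr}: best validation accuracy = {best:.3f}")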

Effect of Optimizers

Experimental setup: Experiment with available optimizers including AdaDelta, AdaGrad, Adam, Gradient Descent, Momentum, and RMSProp

Effect of optimizer on accuracy and speed of convergence

Here are the key takeaways:

  • Adam is a great choice for faster convergence to high accuracy.
  • RMSProp is usually better for RNN tasks.
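
A sketch of the optimizer comparison, again reusing the hypothetical build_model helper and the assumed train_ds/val_ds pipelines. In tf.keras, plain SGD stands in for gradient descent, and SGD with momentum=0.9 stands in for Momentum.

    import tensorflow as tf

    optimizers = {
        "adadelta": tf.keras.optimizers.Adadelta(),
        "adagrad": tf.keras.optimizers.Adagrad(),
        "adam": tf.keras.optimizers.Adam(),
        "sgd": tf.keras.optimizers.SGD(),                   # gradient descent
        "momentum": tf.keras.optimizers.SGD(momentum=0.9),  # gradient descent with momentum
        "rmsprop": tf.keras.optimizers.RMSprop(),
    }

    for name, optimizer in optimizers.items():
        model = build_model(optimizer)
        history = model.fit(train_ds, validation_data=val_ds, epochs=10)
        print(name, max(history.history["val_accuracy"]))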

Effect of Batch Size

Experimental setup: Vary batch sizes in powers of two

Effect of batch size on accuracy and speed of convergence

Here are the key takeaways:

  • The higher the batch size, the more unstable the results from epoch to epoch, with bigger rises and drops. But a higher batch size also makes GPU utilization more efficient, so each epoch runs faster.
  • Too low a batch size slows the rise in accuracy.
  • Batch sizes of 16, 32, or 64 are good starting points.
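
Because the batch size is fixed when the input pipeline is built, each run rebuilds both the datasets and the model. A sketch, assuming a hypothetical load_datasets helper that returns batched training and validation splits:

    import tensorflow as tf

    for batch_size in [8, 16, 32, 64, 128, 256]:
        train_ds, val_ds = load_datasets(batch_size=batch_size)  # assumed helper
        model = build_model(tf.keras.optimizers.Adam(learning_rate=0.001))
        history = model.fit(train_ds, validation_data=val_ds, epochs=10)
        print(batch_size, max(history.history["val_accuracy"]))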

Effect of Resizing

Experimental setup: Vary the image size between 128 x 128 and 224 x 224 pixels

Effect of image size on accuracy

Here are the key takeaways:

  • Even with a third of the pixels, there wasn't a significant difference in validation accuracy. On the one hand, this shows the robustness of CNNs; on the other hand, it might partly be because the Oxford Flowers 102 dataset consists mostly of close-ups of flowers. For datasets in which the object of interest occupies a much smaller portion of the image, the results might be lower.
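
A sketch of the image-size comparison: images are resized in the input pipeline, and the model is rebuilt with a matching input shape. Here raw_train_ds and raw_val_ds are assumed to be unbatched (image, label) datasets, and build_model is the helper from the setup sketch.

    import tensorflow as tf

    def resize_and_batch(ds, img_size, batch_size=32):
        # Resize every image to img_size x img_size, then batch.
        return ds.map(
            lambda image, label: (tf.image.resize(image, (img_size, img_size)), label)
        ).batch(batch_size)

    for img_size in [128, 224]:
        train = resize_and_batch(raw_train_ds, img_size)
        val = resize_and_batch(raw_val_ds, img_size)
        model = build_model(tf.keras.optimizers.Adam(learning_rate=0.001),
                            img_size=img_size)
        history = model.fit(train, validation_data=val, epochs=10)
        print(img_size, max(history.history["val_accuracy"]))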
