Data Science 365

Bring data into actionable insights.


Altering the Sampling Distribution of the Training Dataset to Improve Neural Network’s Accuracy

Rukshan Pramoditha · Published in Data Science 365 · 4 min read · Dec 7, 2022


Photo by William Felker on Unsplash

Who doesn't like to improve the accuracy of a neural network?

There are many ways to improve the accuracy of a neural network. In this article, we will focus on changing the sampling distribution of the training dataset to improve the accuracy of a neural network.

During the training of a neural network, data is passed through the layers as batches.

At the start of each epoch, the training data is randomly shuffled if we set shuffle=True in the fit() method. However, shuffling only changes the order of the samples; the sampling distribution of the training data stays the same from epoch to epoch.
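A minimal NumPy sketch (an illustration of the idea, not Keras's actual shuffling code) makes this concrete: a fresh permutation each epoch reorders the samples, but the set of samples, and hence their empirical distribution, is identical every epoch.

```python
import numpy as np

rng = np.random.default_rng(42)
X = np.arange(10)  # a tiny "training set" of 10 samples

# Simulate two epochs with shuffle=True: a fresh random permutation per epoch
epoch1 = rng.permutation(X)
epoch2 = rng.permutation(X)

# The order of samples differs between epochs, but sorting reveals that
# both epochs contain exactly the same samples: the distribution is unchanged.
print(np.sort(epoch1))
print(np.sort(epoch2))
```

This is why shuffling alone does not alter the sampling distribution: every epoch still draws from exactly the same fixed set of examples.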

Because the sampling distribution is identical at every epoch, it may not represent the population distribution well, and that can reduce the neural network's accuracy.

To avoid this, we can try one of the following methods, which automatically alter (change) the sampling distribution of the training dataset while also providing other major benefits, such as preventing overfitting.

Altering the Sampling Distribution
----------------------------------
01. Dropout Regularization
02. Data (Image) Augmentation
03. Batch Normalization
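To see how one of these methods changes what the network effectively sees, here is a minimal NumPy sketch of inverted dropout (a hypothetical helper for illustration, not Keras's Dropout layer implementation). Each forward pass applies a different random mask, so the same batch is presented to the next layer as a differently perturbed sample every time, which alters the effective sampling distribution across epochs.

```python
import numpy as np

def inverted_dropout(activations, rate, rng):
    """Zero out roughly a `rate` fraction of units and rescale the survivors
    by 1/(1 - rate), so the expected activation is unchanged."""
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
batch = np.ones((4, 8))  # a batch of 4 samples with 8 units each

# Two passes over the same batch draw two different random masks,
# so downstream layers see two different "versions" of the data.
out1 = inverted_dropout(batch, rate=0.5, rng=rng)
out2 = inverted_dropout(batch, rate=0.5, rng=rng)
print(out1)
print(out2)
```

Data augmentation and batch normalization act similarly in spirit: augmentation perturbs the inputs themselves, and batch normalization rescales activations using per-batch statistics, so no two epochs present the network with exactly the same data.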

