Understanding Loss Function and Error in Neural Network

Published in Udacity PyTorch Challengers · Jan 7, 2019

A loss function quantifies how good or bad our current model is at predicting the value it is trained to predict. This article aims to explain the role of the loss function in a neural network.

I assume that you have a basic understanding of neural networks and the PyTorch library. Throughout this tutorial, I will use Google Colab to write and test the code. I will be using the Dog vs Cat dataset from Kaggle. You can download it from here.

Cat images from the Kaggle dataset

The dataset that I downloaded looks like this. I will be using cat images only.

Let’s Get Started!

I will start my explanation with an example of a simple neural network as shown in Figure 1 where x1 and x2 are inputs to the function f(x). The output y_hat is the weighted sum of inputs passed through the activation function. I have omitted bias for the sake of simplicity.
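To make the weighted sum concrete, here is a minimal sketch of such a neuron in plain Python. The sigmoid activation and the weight values are my assumptions for illustration only; the figure does not fix either of them.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x1, x2, w1, w2):
    """Weighted sum of the two inputs, passed through the activation."""
    z = w1 * x1 + w2 * x2  # no bias term, as in the text
    return sigmoid(z)

# Hypothetical inputs and weights, chosen only for illustration.
y_hat = forward(x1=1.0, x2=2.0, w1=0.5, w2=-0.25)
print(y_hat)  # prints 0.5, because the weighted sum is exactly 0
```

Changing the weights changes y_hat for the same inputs, which is exactly the knob that training turns.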

What is a loss function?

Let’s prepare a training dataset of cats along with true labels for each input image and pass it to our model.

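import torch
from torch import nn
from torchvision import datasets, models, transforms

# data_dir is the folder where the Kaggle dataset was extracted
train_transforms = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.ToTensor(),
    # per-channel ImageNet mean and std, which the pretrained model expects
    transforms.Normalize([0.485, 0.456, 0.406],
                         [0.229, 0.224, 0.225])])
train_data = datasets.ImageFolder(data_dir + '/train', transform=train_transforms)
train_loader = torch.utils.data.DataLoader(train_data, batch_size=64, shuffle=True)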

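We will use DenseNet, a pretrained model, for this tutorial, and replace its classifier with our own two-class classifier.

model = models.densenet121(pretrained=True)
# densenet121 feeds 1024 features into its classifier
classifier = nn.Sequential(nn.Linear(1024, 512),
                           nn.ReLU(),
                           nn.Dropout(p=0.2),
                           nn.Linear(512, 2),
                           nn.LogSoftmax(dim=1))
# swap the pretrained classifier for ours
model.classifier = classifier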

All the training images will be passed to this model in batches:

for epoch in range(epochs):  # epochs: number of passes over the training set
  for inputs, labels in train_loader:
    # log-probabilities for each class, one row per image
    logps = model(inputs)

For each batch, we compare the model's output (the predicted values) with the actual labels. In simple words, we check whether an image that the model predicted as a cat is actually a cat. If all the predictions are correct, our model is good. But if the model identifies a dog as a cat, we have an error. This is what the loss function measures.

The loss function takes the predicted scores coming out of our model together with the true labels, compares them, and gives us a quantitative value of how good or bad those predictions are for the training images.
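As a rough illustration of this comparison, the sketch below takes made-up log-probabilities for a batch of three images, picks the most likely class for each, and counts the mismatches against the labels. The numbers and the class ordering (index 0 = cat, index 1 = dog) are assumptions, not output from the model in this tutorial.

```python
import math

# Hypothetical model output: one row of log-probabilities per image,
# columns ordered as [cat, dog] (an assumed ordering).
logps = [
    [math.log(0.9), math.log(0.1)],
    [math.log(0.3), math.log(0.7)],
    [math.log(0.6), math.log(0.4)],
]
labels = [0, 0, 1]  # true classes: cat, cat, dog

def predict(row):
    """Return the index of the class with the highest log-probability."""
    return max(range(len(row)), key=lambda i: row[i])

predictions = [predict(row) for row in logps]
errors = sum(p != t for p, t in zip(predictions, labels))
print(predictions, errors)  # prints [0, 1, 0] 2
```

Counting errors only tells us right from wrong. A loss function goes one step further and also takes into account how confident each prediction was.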

criterion = nn.NLLLoss()
logps = model(inputs)
batch_loss = criterion(logps, labels)

There are different types of loss functions that we can use. Choosing the right one depends on the type of problem we are trying to solve, such as regression or classification. In this tutorial, we use NLLLoss (negative log-likelihood loss), which expects log-probabilities as input; that is why our classifier ends with LogSoftmax.
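To see what this loss actually computes, here is a small plain-Python sketch of the same arithmetic: raw scores are turned into log-probabilities, and the loss is the average of the negative log-probability assigned to the true class. The scores and labels are made up, and the helper names log_softmax and nll_loss are mine, not PyTorch functions; I am assuming the default behaviour of nn.NLLLoss, which averages over the batch.

```python
import math

def log_softmax(scores):
    """Convert one row of raw scores into log-probabilities."""
    m = max(scores)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum for s in scores]

def nll_loss(logps, labels):
    """Average of -log p(true class) over the batch."""
    picked = [row[t] for row, t in zip(logps, labels)]
    return -sum(picked) / len(picked)

# Hypothetical raw scores for 3 images and 2 classes.
scores = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
labels = [0, 0, 1]

logps = [log_softmax(row) for row in scores]
loss = nll_loss(logps, labels)
print(loss)
```

The first image is confidently right and contributes little to the loss, the second is confidently wrong and contributes the most, and the third is a coin flip that lands in between.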

How do we minimize this error?

Once we calculate the loss, the next step is to decide whether to continue training. The decision is based on the loss value. If the loss is still high, we update the weight parameters of the model and repeat the process. If the loss is low enough for our needs, we stop training.

Based on the calculated error, an optimizer adjusts the weights: backpropagation computes the gradient of the loss with respect to each weight, and gradient descent moves the weights in the direction that reduces the loss.

# clear the gradients left over from the previous batch
optimizer.zero_grad()
# compute the gradients
batch_loss.backward()
# update the parameters
optimizer.step()

In this way, we modify the weight parameters of the model and repeat the same process until the loss stops decreasing or is small enough for our needs.
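The whole loop can be sketched by hand on the simple two-input neuron from the beginning of this article. This is a toy illustration in plain Python, not the DenseNet training code: the data points, the learning rate and the stopping threshold are all assumptions, and the gradient is written out manually instead of being computed by backpropagation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical toy data: (x1, x2) inputs with a 0/1 label.
data = [((1.0, 2.0), 1), ((2.0, 0.5), 0), ((0.5, 1.5), 1), ((1.5, 0.2), 0)]

w = [0.0, 0.0]       # weights, no bias, as in the simple network above
learning_rate = 0.5  # assumed value
target_loss = 0.1    # assumed threshold at which we stop training
losses = []

for step in range(1000):
    loss = 0.0
    grad = [0.0, 0.0]
    for (x1, x2), y in data:
        y_hat = sigmoid(w[0] * x1 + w[1] * x2)
        # negative log-likelihood of the true label
        loss += -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))
        # gradient of that loss with respect to each weight
        grad[0] += (y_hat - y) * x1
        grad[1] += (y_hat - y) * x2
    loss /= len(data)
    losses.append(loss)
    if loss < target_loss:
        break  # loss is low enough, stop training
    # gradient descent: move each weight against its gradient
    w[0] -= learning_rate * grad[0] / len(data)
    w[1] -= learning_rate * grad[1] / len(data)

print(len(losses), losses[0], losses[-1])
```

Every pass computes the loss, checks it against the threshold, and otherwise nudges the weights, which is the same cycle that loss.backward() and optimizer.step() carry out for a real network.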

Output

Yay! We have completed the training part for our model. The calculated loss per batch looks like this:

The working code for this tutorial can be downloaded from here.

Summary

The main objective of this tutorial is to explain error calculation and the loss function rather than how to train a model. So we looked closely at the error calculation and the optimization step while training our model. However, this tutorial doesn’t explain all the steps involved in the training process of a neural network. I hope it helps you understand the importance and role of the loss function in a neural network.
