Loss Functions Unraveled

om pramod
8 min read · Aug 15, 2023


Part 4: Python Walkthrough of Loss Functions

Python implementation of loss functions:

In Keras, you can use various loss functions by specifying them when compiling the model.

Here is an example of how to use mean squared error (MSE) as the loss function in a neural network:

model.compile(loss='mean_squared_error', optimizer='adam')

To use a different loss function, simply replace the loss string with the appropriate value.
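These one-liners assume a model object has already been defined. For completeness, here is a minimal sketch of a regression model the MSE examples could apply to; the layer sizes and input shape are illustrative assumptions, not part of the original walkthrough:

import tensorflow as tf

# Minimal regression model (hypothetical architecture, for illustration only)
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation='relu', input_shape=(10,)),
    tf.keras.layers.Dense(1)  # one continuous output for regression
])
model.compile(loss='mean_squared_error', optimizer='adam')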

To evaluate the model’s performance using the MSE loss, you can use the evaluate function:

score = model.evaluate(x_test, y_test)
print('Test MSE:', score)

The resulting score variable represents the MSE for the test dataset.
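One caveat worth knowing: if the model was compiled with additional metrics, evaluate() returns a list rather than a single scalar. A small sketch, assuming the same x_test and y_test arrays as above:

# If extra metrics were specified at compile time, evaluate() returns a list
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mae'])
test_mse, test_mae = model.evaluate(x_test, y_test)
print('Test MSE:', test_mse)
print('Test MAE:', test_mae)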

Creating custom loss functions:

While TensorFlow provides a wide range of pre-built loss functions, there may be situations where a custom loss function is needed to better suit the specific requirements of a particular problem. When creating a custom loss function in TensorFlow, you define a function that takes two arguments, the true labels (y_true) and the model’s predictions (y_pred), computes the loss from these inputs, and returns the loss value. The following code snippet demonstrates how to define a custom loss function.

import tensorflow as tf

def custom_loss(y_true, y_pred):
    squared_difference = tf.square(y_true - y_pred)
    return tf.reduce_mean(squared_difference, axis=-1)

In this example, we define a custom loss function that computes the mean squared error between the predicted and true values. We first compute the squared difference between the predicted and true values using the tf.square operation. We then take the mean of the squared difference using the tf.reduce_mean operation. This custom loss function can be used as a loss function in a TensorFlow model by passing it as an argument to the compile method of a model:

model.compile(optimizer='adam', loss=custom_loss)
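Before training with a custom loss, it is worth sanity-checking it directly on small tensors. The values below are invented purely for illustration:

y_true = tf.constant([[1.0, 2.0]])
y_pred = tf.constant([[1.5, 1.5]])

# squared differences are [0.25, 0.25]; their mean over the last axis is 0.25
print(custom_loss(y_true, y_pred).numpy())  # [0.25]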

If you need to use an additional argument in your custom loss function, you can define a wrapper function that takes in the additional argument and returns the actual loss function. Here’s an example of how you can do this:

import tensorflow as tf

def custom_loss_wrapper(alpha):
    def custom_loss(y_true, y_pred):
        squared_difference = tf.square(y_true - y_pred)
        penalized_difference = squared_difference + alpha * tf.square(y_pred)
        return tf.reduce_mean(penalized_difference, axis=-1)
    return custom_loss

In this example, we define a wrapper function called custom_loss_wrapper that takes in an additional argument alpha. The wrapper function returns the actual loss function custom_loss.

The custom_loss function calculates the squared difference between y_true and y_pred, and then adds a penalty term that depends on the y_pred values multiplied by alpha. The resulting tensor is then reduced along the last axis to obtain the mean loss value.

To use this custom loss function in a TensorFlow model, you can call the wrapper function with the desired value of alpha to obtain the actual loss function, and then pass this loss function as an argument to the compile method of the model:

model.compile(optimizer='adam', loss=custom_loss_wrapper(alpha=0.1))
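A quick way to convince yourself the wrapper behaves sensibly: with alpha=0 the penalty term vanishes and the loss reduces to plain MSE. A small check, again with invented values:

y_true = tf.constant([[1.0, 2.0]])
y_pred = tf.constant([[1.5, 1.5]])

plain = custom_loss_wrapper(alpha=0.0)
penalized = custom_loss_wrapper(alpha=0.1)

print(plain(y_true, y_pred).numpy())      # [0.25], identical to plain MSE
print(penalized(y_true, y_pred).numpy())  # [0.475], larger because predictions are penalized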

You can also define custom losses using Keras backend functions. The backend is essentially the engine that powers the machine learning framework: it provides an interface for performing tensor operations quickly and efficiently. TensorFlow, for example, implements its backend in C++ and CUDA, while the original multi-backend Keras could use TensorFlow, Theano, or CNTK as its underlying engine.

Keras provides a set of backend functions that allow you to perform low-level tensor operations in a way that is independent of the underlying backend. This means you can write code that works with any backend Keras supports, without having to worry about the specific details of the backend implementation. In Keras, some of the backend functions include backend.dot(), backend.sum(), and backend.relu(). These operations are called backend functions because they sit at a lower level than the high-level API we usually use when building a model. We do not normally call them directly; instead, we use high-level building blocks like tf.keras.layers.Dense() in TensorFlow or keras.layers.Dense() in Keras, which internally rely on backend functions to perform their computations.

Backend functions matter precisely because of this backend independence. For example, imagine you are building a model in Keras that uses a custom activation function not provided by the high-level API. You could write the activation function using NumPy, but this would limit your model to running on CPUs. If you instead use Keras backend functions to define the activation, your model will be able to run on GPUs as well, which can greatly improve performance.
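To make the activation example concrete, here is a sketch of a swish-like activation written entirely with backend functions; the function name and the layer it is attached to are illustrative assumptions, not from the original article:

import tensorflow as tf
from tensorflow.keras import backend as K

def swish_like(x):
    # x * sigmoid(x), built from backend ops so it runs on CPU or GPU alike
    return x * K.sigmoid(x)

# Used like any built-in activation (hypothetical layer, for illustration)
layer = tf.keras.layers.Dense(32, activation=swish_like)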

The reason we might want to use backend functions is to customize or extend the functionality of the machine learning framework beyond what is provided in the high-level API. For example, we might want to define a custom activation function or a custom loss function that is not provided by the framework. Here’s an example to demonstrate the need for using backend functions when creating custom loss functions in Keras.

# Define a custom loss function using a plain Python expression
def custom_loss(y_true, y_pred):
    # On some backends, applying a plain Python operator to symbolic tensors raises an error
    squared_difference = (y_true - y_pred) ** 2
    return squared_difference

However, on some backends and framework versions, using this function to compute the loss raises an error such as:

TypeError: unsupported operand type(s) for ** or pow(): 'Tensor' and 'int'

The reason is that ** is a plain Python operator, and not every backend's tensor type overloads it. (Recent versions of TensorFlow do overload the standard arithmetic operators, but relying on that ties your code to one backend.) The portable approach is to use TensorFlow operations or keras.backend functions to perform mathematical operations on tensors.

from tensorflow.keras import backend as K

# Define a custom loss function using backend functions
def custom_loss(y_true, y_pred):
    squared_difference = K.square(y_true - y_pred)  # Use K.square() instead of **
    return squared_difference
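As written, this version returns the element-wise squared differences and leaves the final averaging to Keras, which reduces the returned tensor automatically during training. If you prefer to mirror the earlier MSE example and return one value per sample explicitly, a minimal sketch:

def custom_loss(y_true, y_pred):
    squared_difference = K.square(y_true - y_pred)
    # reduce to one loss value per sample, as the earlier MSE example did
    return K.mean(squared_difference, axis=-1)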

Let’s consider another example. Suppose we want to define a custom loss function that penalizes the model for making predictions that are far from the true labels. A natural starting point is the tf.norm() function, which computes the Euclidean distance between the true and predicted labels. However, if we use tf.norm() with its default arguments inside the loss function, it reduces over every axis and returns a single scalar for the whole batch, whereas a Keras loss is expected to produce one value per sample; combined with subtle shape mismatches between y_true and y_pred, this can surface as a ValueError during training.

import tensorflow as tf

def custom_loss(y_true, y_pred):
    distance = tf.norm(y_true - y_pred)  # Collapses the whole batch to one scalar; can raise a ValueError in training
    return distance

To overcome this problem, we can use backend functions from Keras to perform the same computation as tf.norm(), but explicitly along the last axis, so that each sample keeps its own distance. Here’s an updated example that uses the K.square(), K.sum(), and K.sqrt() functions:

import tensorflow as tf
from tensorflow.keras import backend as K

def custom_loss(y_true, y_pred):
    squared_difference = K.square(y_true - y_pred)
    sum_squared_difference = K.sum(squared_difference, axis=-1)
    distance = K.sqrt(sum_squared_difference)
    return distance

In this updated version, we use K.square() to compute the element-wise square of the difference between the true and predicted labels. Then, we use K.sum() to sum up the squared differences along the last axis of the tensor. Finally, we use K.sqrt() to compute the square root of the summed squared differences, which is equivalent to the Euclidean distance between the true and predicted labels.
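As a quick numeric check with invented values, a classic 3-4-5 right triangle gives a per-sample distance of exactly 5:

y_true = tf.constant([[3.0, 0.0]])
y_pred = tf.constant([[0.0, 4.0]])

# squared differences [9, 16] sum to 25, and sqrt(25) = 5
print(custom_loss(y_true, y_pred).numpy())  # [5.]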

You can also use other TensorFlow functions to overcome the ValueError when creating custom loss functions. One such function is tf.reduce_sum(), which can be used to compute the sum of the squared differences between the true and predicted labels. Here’s an example of how to use tf.reduce_sum() to define a custom loss function:

import tensorflow as tf

def custom_loss(y_true, y_pred):
    squared_difference = tf.square(y_true - y_pred)
    sum_squared_difference = tf.reduce_sum(squared_difference, axis=-1)
    distance = tf.sqrt(sum_squared_difference)
    return distance

In this example, we use tf.square() to compute the element-wise square of the difference between the true and predicted labels. Then, we use tf.reduce_sum() to sum up the squared differences along the last axis of the tensor. Finally, we use tf.sqrt() to compute the square root of the summed squared differences, which is equivalent to the Euclidean distance between the true and predicted labels.

This approach achieves the same result as using backend functions from Keras, just with native TensorFlow functions instead. Backend functions can be easier to read and keep the code backend-agnostic, which matters most for complex computations or more elaborate models.
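Incidentally, tf.norm itself accepts an axis argument, so the per-sample distance can also be written in one line; this variant is an aside not covered in the original discussion:

def custom_loss(y_true, y_pred):
    # axis=-1 yields one Euclidean distance per sample instead of a single batch scalar
    return tf.norm(y_true - y_pred, axis=-1)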

In addition to using Keras backend functions, TensorFlow provides another way to define custom loss functions: the tf.keras.losses.Loss class. To define a custom loss function this way, you create a new class that inherits from tf.keras.losses.Loss and implements two methods: __init__() and call(). The __init__() method defines any hyperparameters or arguments the loss function will use. The call() method actually computes the loss value given the true labels and predicted labels.

Here’s an example of how to define a custom loss function using tf.keras.losses.Loss:

import tensorflow as tf

class CustomLoss(tf.keras.losses.Loss):
    def __init__(self):
        super().__init__()

    def call(self, y_true, y_pred):
        return tf.reduce_mean(tf.math.square(y_pred - y_true))

# Compile and train the model using the custom loss function
model.compile(loss=CustomLoss(), optimizer='adam', metrics=['accuracy'])
history = model.fit(x_train, y_train, epochs=100)

In this example, we create a custom loss function called CustomLoss that takes the true and predicted values as input and returns the mean squared error between them. Because CustomLoss subclasses tf.keras.losses.Loss, it inherits all of the methods and properties required to act as a loss function in a Keras model, so it integrates seamlessly into your Keras training pipeline.
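The __init__() method earns its keep once the loss has hyperparameters. Here is a sketch that carries the earlier alpha penalty over to the class-based style; the class name and parameter are illustrative assumptions:

class PenalizedMSE(tf.keras.losses.Loss):
    def __init__(self, alpha=0.1, name='penalized_mse'):
        super().__init__(name=name)
        self.alpha = alpha  # penalty weight stored as a hyperparameter

    def call(self, y_true, y_pred):
        squared_difference = tf.square(y_true - y_pred)
        penalized_difference = squared_difference + self.alpha * tf.square(y_pred)
        return tf.reduce_mean(penalized_difference, axis=-1)

# Plugs into compile() the same way: model.compile(optimizer='adam', loss=PenalizedMSE(alpha=0.1))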

# Custom loss function visualization:
import matplotlib.pyplot as plt

plt.plot(history.history['loss'])  # history is the object returned by model.fit()
plt.title('Custom Loss Function')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.show()

Bonus point: the difference between multi-class and multi-label classification. Multi-class classification is a classification problem where the output is exactly one of several classes. It is also known as single-label classification because each instance is assigned a single class. In contrast, multi-label classification is a problem where an instance can be assigned multiple classes at once. To illustrate the difference:

Multi-class classification example: Classifying animals into categories like “cat,” “dog,” “bird,” “fish,” etc., where each animal belongs to only one of these categories.

Multi-label classification example: Tagging images with labels such as “beach,” “sunset,” “family,” “dog,” etc., where an image can have multiple labels based on its content.
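This distinction also drives the choice of output activation and loss function. A common pairing, sketched here with an assumed number of classes:

import tensorflow as tf

num_classes = 5  # illustrative value

# Multi-class: softmax output with categorical cross-entropy (classes are mutually exclusive)
multiclass_head = tf.keras.layers.Dense(num_classes, activation='softmax')
multiclass_loss = tf.keras.losses.CategoricalCrossentropy()

# Multi-label: independent sigmoid outputs with binary cross-entropy per label
multilabel_head = tf.keras.layers.Dense(num_classes, activation='sigmoid')
multilabel_loss = tf.keras.losses.BinaryCrossentropy()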


Final Note: And with this, we reach the final chapter of our journey through loss functions. I hope this series has shed light on the critical role that loss functions play in the heart of deep learning.

As you reflect on the knowledge gained, I invite you to offer multiple claps as a token of appreciation. Remember, the quest for understanding is unending, and the world of deep learning holds countless mysteries waiting to be explored. Keep learning, and may your neural networks always converge to brilliance!
