Handling Class Imbalance by Introducing Sample Weighting in the Loss Function

What is the Class Imbalance Problem?

Class Imbalance affects many Machine Learning/Deep Learning classification problems. It occurs when one or more classes (the majority classes) appear far more frequently than the others (the minority classes). Simply put, the data is skewed towards the majority classes.

Why is Class Imbalance a Problem?

So far, we have looked at what the Class Imbalance problem is. But why is it a problem, and why do we need to overcome it?

Why not simply Resample the data differently?

One of the most prominent methods for handling Class Imbalance in a dataset is to undersample the majority classes or oversample the minority classes.

Sample Weighting in Loss Function

Introducing Sample Weights in the Loss Function is a pretty simple and neat technique for handling Class Imbalance in your training dataset. The idea is to weight the loss computed for each sample differently, based on whether the sample belongs to a majority or a minority class. We essentially want to assign a higher weight to the loss incurred by samples from the minority classes.
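To make the idea concrete, here is a minimal sketch of a weighted cross-entropy loss in plain NumPy. The function name weighted_cross_entropy, the toy probabilities, and the weight values are all assumptions made for illustration; they do not come from any particular library or from the original implementation.

```python
import numpy as np

def weighted_cross_entropy(probs, labels, class_weights):
    """Mean cross-entropy where each sample's loss is scaled by its class weight."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    class_weights = np.asarray(class_weights, dtype=float)

    # Probability the model assigned to the true class of each sample.
    true_class_probs = probs[np.arange(len(labels)), labels]
    per_sample_loss = -np.log(true_class_probs)

    # Look up the weight of the class each sample belongs to.
    sample_weights = class_weights[labels]
    return float(np.mean(sample_weights * per_sample_loss))

# Toy batch: two majority-class samples (class 0), one minority-class sample (class 1).
probs = np.array([[0.9, 0.1], [0.8, 0.2], [0.6, 0.4]])
labels = np.array([0, 0, 1])

unweighted = weighted_cross_entropy(probs, labels, [1.0, 1.0])
weighted = weighted_cross_entropy(probs, labels, [1.0, 3.0])  # hypothetical weights
print(unweighted, weighted)
```

With a larger weight on class 1, the misclassified minority sample contributes more to the total loss, so the optimizer is pushed harder to get it right.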

Inverse of Number of Samples (INS)

As the name suggests, we weight each sample by the inverse of the number of samples in the class it belongs to. That is, a class with n samples gets a weight proportional to 1/n.
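A minimal sketch of INS weighting, assuming we already know the number of samples per class. The function name ins_weights and the example counts are hypothetical, and rescaling the weights so they sum to the number of classes is one common convention rather than a requirement of the scheme.

```python
import numpy as np

def ins_weights(samples_per_class):
    """Per-class weights proportional to 1 / n, where n is the class's sample count."""
    counts = np.asarray(samples_per_class, dtype=float)
    weights = 1.0 / counts
    # Assumed convention: rescale so the weights sum to the number of classes.
    return weights / weights.sum() * len(counts)

# Hypothetical dataset: 900, 90 and 10 samples in classes 0, 1 and 2.
ins = ins_weights([900, 90, 10])
print(ins)
```

Because the weights are inversely proportional to the counts, the rarest class here ends up weighted 90 times more heavily than the most frequent one, which can be quite aggressive on very skewed datasets.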

Inverse of Square Root of Number of Samples (ISNS)

Here we weight each sample by the inverse of the square root of the number of samples in the class it belongs to. That is, a class with n samples gets a weight proportional to 1/√n.
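The same idea can be sketched for ISNS. As before, the function name isns_weights and the example counts are hypothetical, and the rescaling step is an assumed convention.

```python
import numpy as np

def isns_weights(samples_per_class):
    """Per-class weights proportional to 1 / sqrt(n), where n is the class's sample count."""
    counts = np.asarray(samples_per_class, dtype=float)
    weights = 1.0 / np.sqrt(counts)
    # Assumed convention: rescale so the weights sum to the number of classes.
    return weights / weights.sum() * len(counts)

# Hypothetical dataset: 900, 90 and 10 samples in classes 0, 1 and 2.
isns = isns_weights([900, 90, 10])
print(isns)
```

The square root softens the re-weighting: the ratio between the rarest and the most frequent class weight is now sqrt(90), roughly 9.5, instead of 90.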

Effective Number of Samples (ENS)

This weighting scheme was introduced in the CVPR’19 paper by Google: Class-Balanced Loss Based on Effective Number of Samples. The re-weighting strategies described above rely on the raw number of samples present in each class. This paper, on the other hand, introduces a weighting scheme that relies on the “Effective Number of Samples”. In the paper, the authors argue that:

“as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula (1−β^n)/(1−β), where n is the number of samples and β ∈ [0, 1) is a hyperparameter”
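In the paper, each class is then weighted by the inverse of its effective number of samples, i.e. proportionally to (1−β)/(1−β^n). Below is a minimal sketch of that computation; the function names, the example counts, and the final rescaling so that the weights sum to the number of classes are assumptions made for illustration.

```python
import numpy as np

def effective_number(samples_per_class, beta):
    """Effective number of samples per class: (1 - beta^n) / (1 - beta)."""
    counts = np.asarray(samples_per_class, dtype=float)
    return (1.0 - np.power(beta, counts)) / (1.0 - beta)

def ens_weights(samples_per_class, beta=0.999):
    """Per-class weights proportional to the inverse of the effective number."""
    weights = 1.0 / effective_number(samples_per_class, beta)
    # Assumed convention: rescale so the weights sum to the number of classes.
    return weights / weights.sum() * len(weights)

# Hypothetical dataset: 900, 90 and 10 samples in classes 0, 1 and 2.
counts = [900, 90, 10]
for beta in (0.0, 0.9, 0.999):
    print(beta, ens_weights(counts, beta))
```

The hyperparameter β controls how strong the re-weighting is. With β = 0 every class has an effective number of 1, so all classes are weighted equally; as β approaches 1, the effective number approaches n and the scheme approaches INS.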

Implementation Source


In this blog, we read about the Class Imbalance problem and how it can adversely affect a model’s learning. We then saw how the simple resampling techniques such as Oversampling or Undersampling can only make the existing problem worse by either overfitting or by missing out on learning important concepts. We finally explored different weighting schemes and how we can apply them to solve the Class Imbalance issue.
