How is PyTorch’s binary_cross_entropy_with_logits function related to sigmoid and binary_cross_entropy?

Yang Zhang
Oct 16, 2018 · 2 min read

This notebook breaks down how the binary_cross_entropy_with_logits function (corresponding to the BCEWithLogitsLoss module, used for multi-label classification) is implemented in PyTorch, and how it relates to sigmoid and binary_cross_entropy.
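
In formula form, the identity demonstrated below is (with σ the sigmoid function):

binary_cross_entropy_with_logits(x, y)
  = binary_cross_entropy(σ(x), y)
  = −mean(y · log σ(x) + (1 − y) · log(1 − σ(x)))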

Link to notebook:

import torch
import torch.nn as nn
import torch.nn.functional as F

Simulated x variable (the logits):

batch_size, n_classes = 10, 4
x = torch.randn(batch_size, n_classes)
x.shape

Out:

torch.Size([10, 4])

Run:

x

Out:

tensor([[ 2.3611, -0.8813, -0.5006, -0.2178],
        [ 0.0419,  0.0763, -1.0457, -1.6692],
        [-1.0494,  0.8111,  1.5723,  1.2315],
        [ 1.3081,  0.6641,  1.1802, -0.2547],
        [ 0.5292,  0.7636,  0.3692, -0.8318],
        [ 0.5100,  0.9849, -1.2905,  0.2821],
        [ 1.4662,  0.4550,  0.9875,  0.3143],
        [-1.2121,  0.1262,  0.0598, -1.6363],
        [ 0.3214, -0.8689,  0.0689, -2.5094],
        [ 1.1320, -0.6824,  0.1657, -0.0687]])

Simulated y variable (first draw random class indices, then one-hot encode them):

target = torch.randint(n_classes, size=(batch_size,), dtype=torch.long)
target

Out:

tensor([1, 1, 3, 0, 2, 0, 2, 2, 1, 2])

Run:

y = torch.zeros(batch_size, n_classes)
y[range(y.shape[0]), target] = 1
y

Out:

tensor([[0., 1., 0., 0.],
        [0., 1., 0., 0.],
        [0., 0., 0., 1.],
        [1., 0., 0., 0.],
        [0., 0., 1., 0.],
        [1., 0., 0., 0.],
        [0., 0., 1., 0.],
        [0., 0., 1., 0.],
        [0., 1., 0., 0.],
        [0., 0., 1., 0.]])
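
As an aside, more recent versions of PyTorch can build the same one-hot matrix in a single call with F.one_hot (not yet available when this notebook was written, so this is an equivalent alternative rather than what the notebook uses):

y_alt = F.one_hot(target, num_classes=n_classes).float()  # same values as y above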

sigmoid + binary_cross_entropy

Run:

def sigmoid(x): return (1 + (-x).exp()).reciprocal()  # 1 / (1 + e^(-x))
def binary_cross_entropy(pred, y): return -(pred.log()*y + (1-y)*(1-pred).log()).mean()

pred = sigmoid(x)
loss = binary_cross_entropy(pred, y)
loss

Out:

tensor(0.7739)

torch.sigmoid + F.binary_cross_entropy

The same computation using PyTorch’s built-in functions:

pred = torch.sigmoid(x)
loss = F.binary_cross_entropy(pred, y)
loss

Out:

tensor(0.7739)
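
As a quick sanity check that the hand-rolled and built-in versions agree beyond the four printed digits (a sketch reusing the sigmoid and binary_cross_entropy defined above):

manual = binary_cross_entropy(sigmoid(x), y)
builtin = F.binary_cross_entropy(torch.sigmoid(x), y)
torch.allclose(manual, builtin)  # True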

F.binary_cross_entropy_with_logits

PyTorch’s binary_cross_entropy_with_logits computes both steps in a single function call:

F.binary_cross_entropy_with_logits(x, y)

Out:

tensor(0.7739)
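
Note that binary_cross_entropy_with_logits is not just a convenience wrapper: fusing the two steps lets PyTorch use the numerically stable formulation max(x, 0) − x·y + log(1 + exp(−|x|)) instead of exponentiating raw logits. A minimal sketch of that identity (a common rewrite, not a line-for-line copy of PyTorch’s internals):

def bce_with_logits_stable(x, y):
    # max(x, 0) - x*y + log(1 + exp(-|x|)) never exponentiates a large
    # positive number, so it cannot overflow the way sigmoid().log() can.
    return (x.clamp(min=0) - x*y + (1 + (-x.abs()).exp()).log()).mean()

bce_with_logits_stable(x, y)  # matches the value above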

For more detail on how the functions above are implemented, see here for a side-by-side translation of all of PyTorch’s built-in loss functions to Python and NumPy.
