# How is PyTorch’s `binary_cross_entropy_with_logits` function related to sigmoid and `binary_cross_entropy`

Oct 16, 2018

This notebook breaks down how the `binary_cross_entropy_with_logits` function (corresponding to `BCEWithLogitsLoss`, which treats each class as an independent binary problem and is therefore typically used for multi-label classification) is implemented in PyTorch, and how it relates to `sigmoid` and `binary_cross_entropy`.

Link to notebook:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
```

Simulated `x` variable:

```python
batch_size, n_classes = 10, 4
x = torch.randn(batch_size, n_classes)
x.shape
```

Out:

`torch.Size([10, 4])`

Run:

`x`

Out:

```
tensor([[ 2.3611, -0.8813, -0.5006, -0.2178],
        [ 0.0419,  0.0763, -1.0457, -1.6692],
        [-1.0494,  0.8111,  1.5723,  1.2315],
        [ 1.3081,  0.6641,  1.1802, -0.2547],
        [ 0.5292,  0.7636,  0.3692, -0.8318],
        [ 0.5100,  0.9849, -1.2905,  0.2821],
        [ 1.4662,  0.4550,  0.9875,  0.3143],
        [-1.2121,  0.1262,  0.0598, -1.6363],
        [ 0.3214, -0.8689,  0.0689, -2.5094],
        [ 1.1320, -0.6824,  0.1657, -0.0687]])
```

Simulated `y` variable:

```python
target = torch.randint(n_classes, size=(batch_size,), dtype=torch.long)
target
```

Out:

`tensor([1, 1, 3, 0, 2, 0, 2, 2, 1, 2])`

Run:

```python
y = torch.zeros(batch_size, n_classes)
y[range(y.shape[0]), target] = 1
y
```

Out:

```
tensor([[0., 1., 0., 0.],
        [0., 1., 0., 0.],
        [0., 0., 0., 1.],
        [1., 0., 0., 0.],
        [0., 0., 1., 0.],
        [1., 0., 0., 0.],
        [0., 0., 1., 0.],
        [0., 0., 1., 0.],
        [0., 1., 0., 0.],
        [0., 0., 1., 0.]])
```
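The one-hot target built by index assignment above can also be produced with `F.one_hot`. This is a minimal sketch, assuming a PyTorch version that ships `torch.nn.functional.one_hot` (it is newer than the release this post was written against); the names `y_manual` and `y_one_hot` are illustrative only, and `target` is hard-coded to the values printed above.

```python
import torch
import torch.nn.functional as F

n_classes = 4
# Class indices copied from the `target` output shown above
target = torch.tensor([1, 1, 3, 0, 2, 0, 2, 2, 1, 2])

# Index-assignment approach, as in the text
y_manual = torch.zeros(len(target), n_classes)
y_manual[range(len(target)), target] = 1

# F.one_hot returns an integer tensor, so cast to float
# before using it as a binary cross-entropy target
y_one_hot = F.one_hot(target, num_classes=n_classes).float()

print(torch.equal(y_manual, y_one_hot))
```

Both routes give the same float matrix; the cast matters because the binary cross-entropy functions expect floating-point targets.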

## `sigmoid` + `binary_cross_entropy`

Run:

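```python
def sigmoid(x):
    # Logistic function: 1 / (1 + exp(-x))
    return (1 + (-x).exp()).reciprocal()

def binary_cross_entropy(pred, y):
    # Mean binary cross-entropy over all elements; pred holds probabilities
    return -(pred.log() * y + (1 - y) * (1 - pred).log()).mean()

pred = sigmoid(x)
loss = binary_cross_entropy(pred, y)
loss
```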

Out:

`tensor(0.7739)`
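Written out as a formula, the two functions above compute the following. The symbols are notation introduced here for illustration: $N$ is the batch size, $C$ the number of classes, $x_{ij}$ a logit, and $y_{ij} \in \{0, 1\}$ the corresponding target.

```latex
\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad
\mathcal{L} = -\frac{1}{NC} \sum_{i=1}^{N} \sum_{j=1}^{C}
\Big[ y_{ij} \log \sigma(x_{ij}) + (1 - y_{ij}) \log\big(1 - \sigma(x_{ij})\big) \Big]
```

The leading $\frac{1}{NC}$ corresponds to the `.mean()` call, which averages over every (sample, class) entry rather than over samples only.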

## `torch.sigmoid` + `F.binary_cross_entropy`

The same computation using PyTorch's built-in functions:

```python
pred = torch.sigmoid(x)
loss = F.binary_cross_entropy(pred, y)
loss
```

Out:

`tensor(0.7739)`
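The built-in result matches the hand-written one because `F.binary_cross_entropy` averages over all elements by default. A small sketch of that, using freshly simulated data (the seed and the names `per_element` and `loss_mean` are arbitrary choices for this example, so the numbers will differ from the outputs above):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # arbitrary seed, only for reproducibility
x = torch.randn(10, 4)
y = torch.zeros(10, 4)
y[range(10), torch.randint(0, 4, (10,))] = 1

pred = torch.sigmoid(x)

# reduction='none' keeps one loss value per (sample, class) entry
per_element = F.binary_cross_entropy(pred, y, reduction='none')
print(per_element.shape)

# The default reduction is the mean over all of those entries
loss_mean = F.binary_cross_entropy(pred, y)
print(torch.allclose(per_element.mean(), loss_mean))
```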

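## `F.binary_cross_entropy_with_logits`

PyTorch's single `binary_cross_entropy_with_logits` function applies the sigmoid and the binary cross-entropy in one call, taking the raw logits `x` directly: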

`F.binary_cross_entropy_with_logits(x, y)`

Out:

`tensor(0.7739)`
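Beyond convenience, the fused function is more numerically stable than applying a sigmoid and then the cross-entropy by hand. The sketch below illustrates this and is not PyTorch's actual source: `naive_bce_with_logits` and `stable_bce_with_logits` are hypothetical helper names, and the stable version uses a common rewrite, max(x, 0) - x*y + log(1 + exp(-|x|)), which is algebraically equal to the sigmoid-then-BCE formula.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def naive_bce_with_logits(x, y):
    # Sigmoid followed by BCE, written out as in the manual version above
    pred = (1 + (-x).exp()).reciprocal()
    return -(pred.log() * y + (1 - y) * (1 - pred).log()).mean()

def stable_bce_with_logits(x, y):
    # Equivalent form that never evaluates log(0):
    # max(x, 0) - x*y + log(1 + exp(-|x|))
    return (x.clamp(min=0) - x * y + torch.log1p(torch.exp(-x.abs()))).mean()

torch.manual_seed(0)  # arbitrary seed, only for reproducibility
x = torch.randn(10, 4)
y = torch.zeros(10, 4)
y[range(10), torch.randint(0, 4, (10,))] = 1

# On moderate logits the formulations agree
reference = F.binary_cross_entropy_with_logits(x, y)
module_loss = nn.BCEWithLogitsLoss()(x, y)  # module form of the same loss
print(torch.allclose(stable_bce_with_logits(x, y), reference, atol=1e-6))
print(torch.allclose(module_loss, reference))

# On an extreme logit the naive version hits log(0),
# while the fused function stays finite
x_big, y_big = torch.tensor([100.0]), torch.tensor([0.0])
naive_big = naive_bce_with_logits(x_big, y_big)
fused_big = F.binary_cross_entropy_with_logits(x_big, y_big)
print(naive_big, fused_big)
```

In float32, `sigmoid(100.0)` rounds to exactly 1, so the naive version takes `log(0)` and returns an infinite loss, whereas the fused function returns a value close to 100.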

For more details on the implementation of the functions above, see here for a side-by-side translation of all of PyTorch’s built-in loss functions to Python and NumPy.