# Derivation of the Binary Cross-Entropy Classification Loss Function

## Derive the log loss function used in machine learning tasks

*This article demonstrates how to derive the **cross-entropy log loss function** used in machine learning binary classification problems.*

The loss function is minimised using *gradient descent*, and network weights are updated through *backpropagation*.

The cross-entropy loss function is a composite function. Therefore, this article also demonstrates how to use the **chain rule** to find the *partial derivatives* of a **composite function**.

A *composite function* is formed by combining two or more functions, as shown in Equation 1. Two functions, *f* and *g*, comprise *y*.
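The equation image is not reproduced here; from the description (*f* and *g* composing *y*), the standard form of a two-function composition is:

```latex
y = f\big(g(x)\big) \tag{1}
```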

Equation 2 is another composite function, *L(a, y)*. It has two variables, *a* and *y*.
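The original equation image is missing; consistent with the article's title, Equation 2 is the standard binary cross-entropy (log) loss:

```latex
L(a, y) = -\big(y \log a + (1 - y)\log(1 - a)\big) \tag{2}
```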

*y* is a **constant**, while *a* is another function dependent on *z*, as shown by Equation 3.

Note that *e* is not a variable; it is **Euler's number**, a *transcendental constant* approximately equal to *2.71828*.
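Given the dependence of *a* on *z* and the appearance of *e*, Equation 3 is the standard sigmoid activation used in binary classification:

```latex
a = \sigma(z) = \frac{1}{1 + e^{-z}} \tag{3}
```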

Furthermore, *z* is a **function** of *w*, *x*, and *b*, as defined by Equation 4.