Image by Peggy und Marco Lachmann-Anke from Pixabay

Logistic regression in PyTorch

custom functions and autograd

Mihai Corciu
3 min read · Apr 22, 2019


Perceptron

Basically, the perceptron (neuron), the elementary unit of a neural network, is a logical computing unit which, given n input values and n weights, computes the summation

z = w₁x₁ + w₂x₂ + … + wₙxₙ + b

Simplifying the situation to only two input values, our neuron becomes a line splitting the plane into two opposite sections, one positive and one negative.

Say, for example, we have the equation 2x + 3y + 4 = 0, which gives us the following positive half-plane (in red), defined by the inequality 2x + 3y + 4 > 0:
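A rough matplotlib sketch that reproduces this picture (the line 2x + 3y + 4 = 0 with the positive half-plane shaded in red):

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-5, 5, 200)
y = (-2 * x - 4) / 3                                  # the line 2x + 3y + 4 = 0

plt.plot(x, y, color="black")                         # the boundary itself
plt.fill_between(x, y, 5, color="red", alpha=0.3)     # where 2x + 3y + 4 > 0
plt.xlim(-5, 5)
plt.ylim(-5, 5)
plt.xlabel("x")
plt.ylabel("y")
plt.show()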

Having two categories of points in the plane, tagged 0 and 1 respectively, our single neuron can draw a linear boundary between these two categories. For this task, the summation value of the neuron must be passed through the sigmoid function (to obtain a probability that each point is classified correctly), and the weights of the neuron must be updated according to the cross-entropy error function (see backpropagation).
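For reference, the sigmoid is σ(z) = 1 / (1 + e^(−z)), which squashes the summation z into a probability between 0 and 1, and the cross-entropy error for a single point is E = −[y·log(ŷ) + (1 − y)·log(1 − ŷ)], where ŷ = σ(z) is the predicted probability and y is the 0/1 tag.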

With all these prerequisites stated, we define our input data. We could define it as a NumPy array and convert it into a Tensor. Weights and bias must be defined with the option requires_grad=True (in older PyTorch, as Variable) to activate the automatic back-propagation feature (autograd).
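A minimal sketch of that setup (the data values here are just illustrative, not the ones from the original code):

import numpy as np
import torch

# two groups of 2-D points, tagged 0 and 1 (illustrative values)
x_np = np.array([[0.5, 0.3], [1.0, 1.2], [3.0, 3.5], [4.0, 3.8]], dtype=np.float32)
y_np = np.array([0.0, 0.0, 1.0, 1.0], dtype=np.float32)

# convert the numpy arrays into tensors
x = torch.from_numpy(x_np)
y = torch.from_numpy(y_np)

# weights and bias tracked by autograd
# (in current PyTorch, Variable is merged into Tensor, so requires_grad=True is enough)
w = torch.randn(2, requires_grad=True)
b = torch.zeros(1, requires_grad=True)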

Because PyTorch offers multiple implementation choices, we first try the simplest way of implementing the error function, following the statements above step by step, and provide an:

Explicit definition
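A sketch of what the explicit version could look like, reusing the tensors defined in the setup above (ŷ is written y_hat):

# forward pass: weighted summation followed by the sigmoid
y_hat = torch.sigmoid(x @ w + b)

# cross-entropy written out term by term, averaged over the points
loss = -(y * torch.log(y_hat) + (1 - y) * torch.log(1 - y_hat)).mean()

loss.backward()        # autograd fills in w.grad and b.grad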

The same result could be obtained using:

Predefined function
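With a predefined loss, for example binary_cross_entropy from torch.nn.functional (or the torch.nn.BCELoss module), the sketch above shrinks to:

import torch.nn.functional as F

y_hat = torch.sigmoid(x @ w + b)
loss = F.binary_cross_entropy(y_hat, y)   # same value as the explicit definition
loss.backward()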

Or we could define our own:

Custom function
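One possible shape for such a custom function is a subclass of torch.autograd.Function with a forward and a backward method; the sketch below implements the same cross-entropy (an illustration, not necessarily the exact code from the repository):

import torch

class CrossEntropy(torch.autograd.Function):
    @staticmethod
    def forward(ctx, y_hat, y):
        # save the inputs for the backward pass
        ctx.save_for_backward(y_hat, y)
        return -(y * torch.log(y_hat) + (1 - y) * torch.log(1 - y_hat)).mean()

    @staticmethod
    def backward(ctx, grad_output):
        y_hat, y = ctx.saved_tensors
        # local gradient dE/dy_hat of the averaged cross-entropy,
        # multiplied by the gradient received from the next function in the chain
        grad_y_hat = grad_output * (y_hat - y) / (y_hat * (1 - y_hat) * y_hat.numel())
        return grad_y_hat, None   # the labels y need no gradient

# usage: loss = CrossEntropy.apply(y_hat, y)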

This needs a basic understanding of the automatic backpropagation mechanism used by PyTorch.

Recall the chain rule: if z = f(y) and y = g(x), then

dz/dx = (dz/dy) · (dy/dx)

Every PyTorch math function, apart from its forward procedure (which computes the output), also has a backpropagation procedure: the gradients of its output with respect to each of its inputs. When backward() is invoked, all the functions involved pass their gradients to one another, and these gradients are multiplied together, exactly as in the chain rule above.

Computing derivatives (backpropagation) of a scalar tensor (a tensor with one element — 0-dimensional) is done by simply invoking .backward(). If the tensor is not a scalar (it holds more than one element of data), we get the error message “RuntimeError: grad can be implicitly created only for scalar outputs”, because the backward method of the function needs to receive the gradients of the previous function in the composition chain, in order to multiply them with its own and pass the result to the next function in the chain. An example will make this clearer:
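For instance, assuming a simple vector-valued tensor:

import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
z = x ** 2                 # z has three elements, so it is not a scalar

try:
    z.backward()           # no incoming gradient to multiply with
except RuntimeError as e:
    print(e)               # grad can be implicitly created only for scalar outputs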

z.backward() must receive the gradients of the previous function in the chain, the sum, which are just ones (the derivative of a sum with respect to each of its terms is 1):
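In code, the two equivalent ways of doing this could look like:

import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
z = x ** 2

# reduce to a scalar first: the sum's backward sends ones down to z
z.sum().backward()
print(x.grad)                        # tensor([2., 4., 6.]), i.e. 2 * x

# or pass those ones to z.backward() explicitly
x.grad.zero_()
z = x ** 2
z.backward(torch.ones_like(z))
print(x.grad)                        # tensor([2., 4., 6.]) again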

And one more example, for a bit more insight:
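A slightly longer chain, to see the gradients being multiplied together:

import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
y = 3 * x              # dy/dx = 3
z = y ** 2             # dz/dy = 2 * y
out = z.sum()          # d(out)/dz = 1

out.backward()
# chain rule: d(out)/dx = 1 * (2 * y) * 3 = 18 * x
print(x.grad)          # tensor([18., 36.])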

Finally, putting it all together:
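A minimal sketch of the whole pipeline, assuming a small synthetic two-class dataset and the explicit cross-entropy from above (the repository linked below contains the full version):

import torch

torch.manual_seed(0)

# synthetic data: two Gaussian blobs tagged 0 and 1 (illustrative)
n = 100
x = torch.cat([torch.randn(n, 2) - 2.0, torch.randn(n, 2) + 2.0])
y = torch.cat([torch.zeros(n), torch.ones(n)])

# weights and bias tracked by autograd
w = torch.randn(2, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

lr = 0.1
for epoch in range(200):
    y_hat = torch.sigmoid(x @ w + b)                     # forward pass
    loss = -(y * torch.log(y_hat) +
             (1 - y) * torch.log(1 - y_hat)).mean()      # cross-entropy

    loss.backward()                                      # autograd computes the gradients
    with torch.no_grad():                                # manual gradient-descent step
        w -= lr * w.grad
        b -= lr * b.grad
        w.grad.zero_()
        b.grad.zero_()

print("final loss:", loss.item())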

The entire code is available at Logistic regression.

Graphical results of the above code:

References:

Logistic regression github-repo

Automatic differentiation for non-scalar variables; reconstructing the Jacobian

PyTorch with examples

Udacity: intro to deep learning with PyTorch
