All the single neurons

Where we learn that it takes one neuron to solve multilinear regressions and logistic regressions, which I thought were pretty advanced back in university

Cédric Bellet
Biffures
5 min read · Feb 27, 2018


A neuron in machine learning looks like this if you Google it:

Neuron representations, various artists

Here is my own representation of a single neuron — a bit wavier, impractical when it comes to drawing networks, but hopefully not as bad for educational purposes.

My own representation of a single neuron

A neuron has a body; to its left, dendrons connect to that body, and synapses act as receptors for input coming into the neuron. To its right, the neuron has an axon and one or more axon terminals that pass the neuron’s information on to the next neuron. Axon terminals connect to synapses.

Messages at the synapses are numbers that are amplified or reduced as they pass through the dendrons, in proportion to the dendrons’ weights, noted w, or b in the case of the bias dendron (the bias dendron is always plugged into an incoming signal equal to 1). If a weight w is equal to 0, the dendron is as good as dead: it passes no information at all. Signals from each dendron are summed in the neuron’s body as the weighted input z.

The weighted input z then passes to the axon, where it is optionally transformed by an activation function σ. The activation function can be the identity function (i.e., it does nothing to the weighted input), the sigmoid function, or the rectified linear function, but really it can be any function needed for a given application. What we get after the activation function is a, the neuron’s activation state.

Sample activation functions according to Wikipedia. Linear or non-linear, monotonic or not, symmetric or not… Activation functions can even be functions not just of the weighted input z, but of other parameters, like the weighted inputs of other neurons; see for example the softmax activation function.

From a place where I can actually use MathJax, here is what I have to say about the forward pass for our single neuron:
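In compact form, writing x for the vector of incoming signals, w for the vector of dendron weights, b for the bias and a for the activation:

$$z = w \cdot x + b, \qquad a = \sigma(z)$$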

the dot product is a scalar product, as opposed to the Hadamard (or element-wise) product also used in machine learning

Putting single neurons to work

So far our neuron takes inputs and produces an output, but that is it. If the weights and bias are set to certain values, the output can be useful; otherwise we just have a random number generator.

In order to make the neuron useful, we need to train it. So we need a trainer for that neuron that will (i) evaluate the neuron’s performance, and (ii) provide a learning procedure so the neuron performs better. We use a cost function to define (i), and the back-propagation algorithm to address (ii).
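In its simplest form, that learning procedure is a gradient-descent update of the weights and bias (a generic sketch, with η a learning rate):

$$w \leftarrow w - \eta\,\frac{\partial C}{\partial w}, \qquad b \leftarrow b - \eta\,\frac{\partial C}{\partial b}$$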

So what now? Well, let’s replace C and σ in the equations above with actual functions and we will start to see how those formulas are just generalizations of classic regression problems.

Single neurons solve (multi)linear regressions

The goal of linear regressions is to find, given a dependent variable and n explanatory variables, a line, expressed as a function of the explanatory variables, such that the distance between the dependent variable and that line is minimal. A simple example of a linear regression is:
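For instance, with samples indexed by i, one standard least-squares way to write it is:

$$\min_{\beta_0,\dots,\beta_n}\; \sum_i \Big(y_i - \big(\beta_0 + \beta_1 x_{1,i} + \cdots + \beta_n x_{n,i}\big)\Big)^2$$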

Here is exactly the same program, but written using the conventions we have used to describe our neuron so far:
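Roughly, with weights w, bias b, activation σ and cost C:

$$\min_{w,\,b}\; C(a) = \sum_i \big(y_i - a_i\big)^2, \qquad a_i = \sigma(w \cdot x_i + b), \qquad \sigma = \mathrm{Id}$$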

in which we recognize a single neuron net problem, where the axon activation function σ is the identity function, and C is the squared l2 (or Euclidean) norm.

This neuron (Id activation function) with a squared l2 norm solves 3-parameter linear regressions

Single neurons solve logistic regressions

Similarly, logistic regressions can be expressed as single neuron network problems. Consider the stated goal of a logistic regression:
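In one standard formulation, for binary labels y in {0, 1}, logistic regression fits probabilities of the form below by minimizing the negative log-likelihood:

$$\min_{\beta}\; -\sum_i \Big[\, y_i \log \hat{p}_i + (1 - y_i)\log\big(1 - \hat{p}_i\big) \Big], \qquad \hat{p}_i = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_{1,i} + \cdots + \beta_n x_{n,i})}}$$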

which is equivalent to the single neuron net program:
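That is, in the neuron’s notation, with σ the sigmoid and C the cross-entropy cost:

$$\min_{w,\,b}\; C(a) = -\sum_i \Big[\, y_i \log a_i + (1 - y_i)\log\big(1 - a_i\big) \Big], \qquad a_i = \sigma(w \cdot x_i + b) = \frac{1}{1 + e^{-(w \cdot x_i + b)}}$$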

This neuron solves classification problems with 3 input parameters

In summary, using a single neuron, switching between linear and logistic regression is simply a matter of flipping the activation and cost functions of the neuron:
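  • Linear regression: identity activation, squared l2 (Euclidean) cost
  • Logistic regression: sigmoid activation, cross-entropy cost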

A Python implementation and real regressions solved by a single neuron

Here is how the single neurons fared on four different regression exercises:

Black and white dots are inputs, blue lines are the neuron’s answers. Left: logistic regression exercises (separate the black and white dots); right: linear regressions (find the line closest to all points). Noise was introduced to prevent the neuron from getting a perfect answer. I increased the order of the datasets to allow the neuron to find polynomial patterns.
  • The two logistic regressions were solved by a single neuron with 10 input dendrons, a sigmoid activation function and a cross-entropy cost function
  • The two linear regressions were solved by single neurons with 5 to 10 input dendrons, an identity activation function and a squared l2 cost function

The core code to reproduce those results is shown below, and you can find the full code on GitHub.
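Here is a minimal sketch of that logic in NumPy, a reconstruction of the idea rather than the exact code from the repository:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class SingleNeuron:
    """A single neuron a = sigma(w . x + b), trained by plain gradient descent."""

    def __init__(self, n_inputs, activation="identity"):
        self.w = np.zeros(n_inputs)   # dendron weights
        self.b = 0.0                  # bias dendron weight
        self.activation = activation  # "identity" or "sigmoid"

    def forward(self, X):
        z = X @ self.w + self.b       # weighted input z, one value per sample
        return sigmoid(z) if self.activation == "sigmoid" else z

    def train(self, X, y, lr=0.1, epochs=5000):
        for _ in range(epochs):
            a = self.forward(X)
            # For both pairings used above (identity + 1/2 squared l2 cost,
            # sigmoid + cross-entropy cost), the gradient of the cost with
            # respect to z simplifies to (a - y).
            delta = (a - y) / len(y)
            self.w -= lr * (X.T @ delta)
            self.b -= lr * delta.sum()
        return self

# Hypothetical usage: a 3-input linear regression and a 3-input classification
# X = np.random.randn(200, 3)
# linear = SingleNeuron(3, "identity").train(X, X @ [1.0, -2.0, 0.5] + 0.3)
# logistic = SingleNeuron(3, "sigmoid").train(X, (X.sum(axis=1) > 0).astype(float))
```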

Nothing too crazy, and just like that we learnt that single neurons are incredibly powerful. The next question I am asking myself is why the rectified linear activation function seems so prevalent in actual neural networks, if the identity and sigmoid functions are so strong. In particular, rectified linear activations cause neurons to “die” (output 0) when their weighted input is negative; what value could possibly be gained from creating neurons that exhibit that sort of behavior? Nothing useful in the context of single neurons, but possibly something good when it comes to neural networks, as we shall learn.

The full code used to create the single neurons that produced the charts above is at https://github.com/cedricbellet/1neuron_net.
