Artificial Neural Networks for Beginners: Part 1: Concepts

Image representing a simple neural network architecture

Artificial Neural Networks (ANNs) can be a mind-boggling topic in machine learning, especially when you have to turn the architecture itself into code. In this tutorial series we will go through the mechanics of neural networks in terms of both biology and computation, and then through a code implementation in Python. Part 1 covers the mechanism of a simple neural network consisting of a few neurons, its feature variables, and Hebb’s rule.

No knowledge of neuroscience is required for this tutorial, but a little programming intuition will make the concepts easier to grasp.

What is a Neural Network?

As the name suggests, neural networks are information-processing architectures inspired by our brain’s communication networks. A human brain contains on average about 86 billion neurons, and these neurons acquire, store, and retrieve day-to-day learning and memories. When learning occurs, the brain stores the result in the form of new neural connections. When we recall that memory or skill, input neurons fire electrical signals simultaneously, and those signals travel to the target neuron to activate the memory.

In the above figure, X1 to X3 represent our input neurons (features). For the output node to be activated (in our case, a memory recall), all three inputs (X) need to fire simultaneously.

Each connection from an input X carries a “weight”: think of it as the importance of that variable X. In machine learning terms, the weight plays a role loosely similar to feature importance. By analogy, think of the weight as the strength of the electrical impulse: the stronger the signal, the larger the role variable X plays in the output node (the prediction). In an Artificial Neural Network (ANN) the weights are updated repeatedly during training. Neural networks are effective at capturing feature interactions and hidden patterns, which also makes them suitable for unsupervised learning.
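To make the idea of weighted inputs concrete, here is a minimal Python sketch of a single output node with three inputs. The function name `neuron_output`, the equal weights of 1.0, and the threshold of 3.0 are assumptions chosen for illustration so that the node activates only when all three inputs fire together; they are not taken from any particular library.

```python
def neuron_output(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


# Assumed values: every input is equally important, and the threshold
# is set so that all three inputs must fire (value 1) at the same time.
weights = [1.0, 1.0, 1.0]
threshold = 3.0

all_fire = neuron_output([1, 1, 1], weights, threshold)  # X1, X2, X3 all fire
two_fire = neuron_output([1, 1, 0], weights, threshold)  # X3 stays silent

print(all_fire, two_fire)  # prints: 1 0
```

Raising one weight or lowering the threshold in this sketch would let the node activate with fewer firing inputs, which is one way to picture a single input becoming more "important".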

In artificial neural networks, this phenomenon is described by Hebb’s rule, a method for determining how to alter the weights between model neurons. The weight between two neurons increases if the two neurons activate simultaneously and decreases if they activate separately. Nodes that tend to be both positive or both negative at the same time develop strong positive weights, while nodes that tend to be opposite develop strong negative weights.

or as phrased by Siegrid Löwel:

“Cells that fire together, wire together”
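Hebb’s rule as described above can be sketched in a few lines of Python. This is an illustrative sketch, not a full training procedure: it assumes activations are encoded as +1 (firing) and -1 (not firing), uses the simple update "new weight = old weight + learning rate × input × output", and the names `hebbian_update`, `learning_rate`, and the sample data are made up for the example.

```python
def hebbian_update(weights, inputs, output, learning_rate=0.1):
    """Apply one Hebbian step: each weight moves by learning_rate * input * output.

    With +1/-1 activations the product is positive when input and output
    agree (the weight grows) and negative when they disagree (it shrinks).
    """
    return [w + learning_rate * x * output for w, x in zip(weights, inputs)]


# Assumed toy data: (inputs for X1..X3, output activation).
# X1 always agrees with the output, X3 always disagrees, X2 is mixed.
samples = [
    ([1, 1, -1], 1),
    ([1, -1, -1], 1),
    ([-1, -1, 1], -1),
]

weights = [0.0, 0.0, 0.0]
for inputs, output in samples:
    weights = hebbian_update(weights, inputs, output)

print(weights)
```

After these three samples the weight for X1 ends up positive, the weight for X3 ends up negative, and the weight for X2 sits in between, matching the "fire together, wire together" intuition.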

The output node carries the aggregated sum of all the input signals after each has been scaled by its weight. Please keep in mind that this is a simplified neural learning model; real computational ANNs can consist of thousands of nodes and hundreds of features, with millions of connections and interactions.

This marks the end of Part 1, where we talked about how signals from the input variables X propagate to the output neuron. If the “intensity” or “weight” of a signal contributes significantly to the outcome at the output neuron (the prediction), that weight increases, and so does the importance of the feature.