Neural Networks — As Simple as They Get

This article is a simplified way of explaining neural networks to people (me being one of them) who have nothing to do with how the technology works but do care about the impact it will have.

Abhijeet Singh
The Startup
9 min read · Nov 17, 2019


Network by daliscar1

When I first tried to understand neural networks, they turned out to be super complicated for me, even though I come from a technical background and studied physics, maths and computer science all through high school and my bachelor’s degree. But my curiosity kept poking me in the belly to pursue the tech enough to let it all sink in. After reading about it for quite some time now, I think there is a less technical way to understand it for us science noobs.

That essentially means that, throughout this article, I will use an analogy or two to help you understand the underlying technology as easily as possible. So please take it with a grain of salt.

To kick things off, let’s look at the textbook definition of neural networks, or in this case, the one that a Google search found for me.

It’s a computer system modelled on the human brain or nervous system.

Human brain

Simply put, just as our brain consists of a network of neurons, neural networks consist of a network of nodes that interact with each other.

What these nodes are and how they interact is a question that I will address later in this article. But first, let’s look at a rather fundamental problem:

How can you model something on a human brain — the working of which is still not clear to us?

Yes, we understand which parts of the cortex are active when we make certain decisions or experience feelings such as love, joy or distress. But we still aren’t clear on what makes us choose one thing over another. How do we decide what’s wrong or right, or make comparisons between different or similar things?


We just choose one thing (or another) by prioritizing, which can be based on several parameters, and these vary from person to person. The important part here is that all of these parameters are taken into account and a decision is made without us knowing the underlying process behind it.

Take, for instance, the ability of the human brain to make sense of things. If I see a tree or some scenery, I might have a certain feeling about it. I might feel happy about the view, or I might move into the shade of the tree if it’s too sunny. But when we think about how our brain does this, even the best neuroscientists will find it hard to answer.

But since neural networks use the human brain as their model, we need to focus on the common ground, which is: feed in raw data and get a meaningful answer out. A meaningful answer, in this case, is whatever we expect the neural network to produce.

A two-layer feed-forward representation of a neural network

Similar to the neurons in the brain, a neural network has an array of nodes arranged in one or more layers. The nodes enable machine learning algorithms to draw meaningful conclusions from the raw data. As more data is fed in, the system should get smarter and thus “learn”.

But what are these nodes? They are the computational units in a neural network and interact with other nodes via one or more connections. Through these connections, they pass signals to the preceding or succeeding rows of nodes. This iterative back-and-forth is what lets a neural network learn, and it is why nodes are important to any artificial intelligence system. It’s like my friends repeatedly schooling me to do the right thing; consequently, I would know what puts them off and what is acceptable.

A simple neural network

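If you’re comfortable seeing a tiny bit of code, here is a minimal sketch (in Python, with numbers and names I made up purely for illustration) of what a single node typically computes: it weighs the signals coming in, adds a little bias of its own, and squashes the result into a value between 0 and 1.

```python
import math

def node_output(inputs, weights, bias):
    """One node: weigh the incoming signals, add a bias,
    and squash the result into a value between 0 and 1."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))  # sigmoid "squashing" function

# A hypothetical node listening to three nodes in the previous row
print(node_output([0.2, 0.9, 0.4], [0.5, -0.3, 0.8], bias=0.1))
```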

When “learning”, a neural network has to know when it is making a correct decision and when it is not. How correct a decision is gets measured via the cost function, but that’s a topic I will cover in another article.

What’s the correct decision? The one that we, or the programmer, want it to make. For example, when the leaves turn red, we know that autumn has begun. How do we make a neural network recognize the same thing?

Then there are different kinds of neural networks, from a simple handwriting recognition system to complex voice recognition or image recognition systems.

But what’s the difficulty here? Why don’t we store all the languages on the system, plug it into a camera and let the system compare the input to recognize, for instance, letters? This only works until you want it to recognize handwriting. Beyond that, it becomes tricky, because you cannot store the different handwriting of every individual.

Our brain, on the other hand, is capable of recognizing different handwritings or different accents in whatever language we speak and know.

The way out is to let the system train on its own and figure out which letter is which, even though the letters are written differently by different individuals. But where do we start?

Let’s try to understand the workings of neural networks with the example of a kindergarten class; to be precise, an analogy for a multilayer neural network.

Imagine a class of kindergarten students, in this case a very talkative bunch. And, guess what, we want this group to be as talkative as it can be. The kids are sitting in four rows, the first row being the input row, where the teacher hands over a certain hand-drawn geometric shape, our raw data.

Representation of neural network via a classroom

The middle two rows are the hidden rows and are separated from the teacher by a curtain. However, the hidden rows can interact with the input row and the output row on either side.

Each student, or row of students, is equipped with certain instructions on how to respond to certain types of geometric shapes. For example, let’s say the first row is told to accept any kind of geometric shape.

Any particular student in the first row doesn’t know the name of this geometric shape. However, thanks to the cost function, they know of a node, another student in the hidden row, that might know which shape it is. Since this student (node) can talk to every student in the hidden row (layer), they pass the geometric shape on to every node they think might know the shape.

Next up comes the communication part, where each student lets the preceding or succeeding students know who they think is the right student to send the input to. The actual guessing of the shape is done by the last row (layer), based on the input it receives.

Which essentially means the last row might or might not guess the correct shape.


This is done via the loudness of the greyscale value (any value between 0 and 1, e.g. 0.1, 0.2, etc.) in the case of an image recognition system, where 0 means least likely and 1 means highly likely when the students (nodes) pick the next student. In the case of our little class, let’s imagine they use “Hmm” to make this known to their counterparts. That means a short “Hm” (0) would be highly unlikely and a long “Hmmmmmmmmmm” (1) highly likely.

In short, all the nodes are constantly exchanging these values with each other and thereby informing each other of what a particular row (layer) of students thinks. Based on this, the next nodes make a decision and give feedback in return.
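If you prefer numbers to “hmm”s, here is a hedged little sketch of one row of students passing their loudness values to the next row. The weights and values are invented for illustration; real networks have thousands or millions of them.

```python
import math

def sigmoid(x):
    # Squash any number into a "loudness" between 0 and 1
    return 1 / (1 + math.exp(-x))

def next_row(previous_values, weights, biases):
    """Each student in the next row listens to every student in the previous
    row, weighs what they hear, and answers with their own loudness."""
    return [
        sigmoid(sum(v * w for v, w in zip(previous_values, node_weights)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]

# Hypothetical: three students pass their values to two students in the next row
print(next_row([0.1, 0.8, 0.5],
               [[0.4, -0.2, 0.7], [0.9, 0.1, -0.5]],
               [0.0, 0.2]))
```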

Let’s move forward. The geometric shape has now been passed to the hidden rows of students. The students in the hidden rows aren’t any smarter either; however, here every student is equipped to recognize certain parts of a geometric shape. Some know how to identify lines, some know how to identify curves, and some recognize pointy ends.

In the second hidden row, the students know how to stitch together the information they got from the preceding row and then pass it on to the final row. The final row compares this information with the shapes it knows and guesses the name of the shape. Depending on what they guess and what the actual shape is, they are rewarded via the cost function values. Let’s say, in the case of our kindergarten class, the teacher claps for a right answer and nods for a wrong one.
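The clapping teacher is, roughly speaking, the cost function in miniature: a single number that says how far the class’s guess was from the truth. A toy sketch, with shapes and numbers I invented for illustration:

```python
def cost(guess, correct):
    """The teacher's verdict as a number: 0 means a perfect guess,
    bigger numbers mean the class was further off."""
    return sum((g - c) ** 2 for g, c in zip(guess, correct))

# Hypothetical final-row guesses for (circle, square, triangle)
guess = [0.1, 0.7, 0.2]      # the class leans towards "square"
correct = [0.0, 1.0, 0.0]    # the teacher actually drew a square
print(cost(guess, correct))  # a small number, so the teacher claps
```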

Now that we have established the basic workings of the neural network model, let’s try to understand how the feedback chain works; that is, how the nodes let each other know which shape it might be.

So let’s start one more time: the teacher gives the shape to the first row. The first row, based on the greyscale value, or the intensity of their “hmm”s, passes it on to certain nodes that might recognize the shape.

The important thing to note here is that not even the engineers need to know what’s going on in the hidden layers, but they do know that certain nodes will fire up if a certain kind of shape is fed in. Moreover, the cost function’s feedback helps the system guess the right answer, and over many iterations the system improves itself.

What the engineers can do is tweak the connections between nodes, or, simply put, change the seating order of the students in whatever way they think will make the system more efficient.
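“Changing the seating order” is, roughly, what adjusting the connection weights looks like in code. Here is a toy sketch (not how any production framework actually does it, and with numbers I made up): a single connection weight gets nudged over many iterations until the node’s output drifts towards the answer the teacher wants.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

# Toy setup: one input signal, one connection weight, one target answer
x, target = 1.0, 0.9
weight, learning_rate = 0.0, 2.0

for step in range(500):
    output = sigmoid(weight * x)
    error = output - target
    # Nudge the weight a little in the direction that shrinks the error
    weight -= learning_rate * error * output * (1 - output) * x

# Much closer to the target of 0.9 than the starting output of 0.5
print(round(sigmoid(weight * x), 3))
```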

The guiding lights in all of this are the different machine learning algorithms. It’s like the kind of “hmm-hmm” song the neural network is designed to sing, and every song should lead to a pre-expected result.

A few machine learning (ML) algorithms to name are logistic regression and the backpropagation neural network, which fall into the category of supervised ML algorithms; the Apriori algorithm, which is an unsupervised ML algorithm; and then there are some mixed forms of such algorithms.
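As a small illustration of what “supervised” means in practice, here is a sketch using scikit-learn’s LogisticRegression. The data and labels are completely made up; the point is only that the model is shown both the inputs and the right answers.

```python
from sklearn.linear_model import LogisticRegression

# Made-up toy data: each row is [number of curves, number of straight edges]
X = [[1, 0], [0, 4], [1, 0], [0, 3]]
# Labels supplied by a "teacher": 0 = circle, 1 = square
y = [0, 1, 0, 1]

model = LogisticRegression()
model.fit(X, y)                  # supervised: inputs and answers together
print(model.predict([[0, 4]]))   # should come back as "square" (1)
```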

Neural networks are designed to learn; there’s no pre-determined way or hard-coded rule for every case, although there’s an algorithm that assists in finding the desired answer.

If you are wondering what this classroom of little kindergarten students looks like in real life, see below the latest neural network chip from Intel, called the Nervana NNP-T. Other chips that use neural networks are the Apple A12 Bionic and the Huawei Kirin 980.

Intel Nervana NNP-T

Hey there, thanks for making it to the very end! Based on the response to this article, I will also try to explain more aspects of neural networks, such as cost functions, the greyscale values, and the “learning” part. Also, I would like to understand the best approach to explaining neural networks and technologies like them. Should I get more technical, or should I make something like what I tried to create in this article? Did I miss something, or should I explain some things in more detail? Let me know in the comments. Until next time, cheers.
