Neural Networks For Dummies

Jeremi Nuer
7 min read · Oct 7, 2021

Neural Networks explained in simple terms so that anyone can understand them (including an idiot like myself)

Let me guess. You’re at your wit’s end. The frustration has been building up for the past few hours, and soon it will turn into resignation. All this neural network stuff sounds good in the moment, but after reflecting on it for a minute, you come to the incredibly astute observation that none of it makes any f***ing sense.

Well, if that isn’t you, I’m slightly annoyed, and my ego is miffed. Because it sure as heck was me! Everyone throws around big terms like “hidden layers” or “activation functions,” and then gives half-assed explanations that leave you more confused than when you started. Who do they think we are, Einstein?? Well, the truth is, the topic of Neural Networks is not that complicated, and it’s not that hard to learn. Let me prove it to you.

So, What Even Are Neural Networks?

Put simply, a Neural Network is a type of Machine Learning model that can take in data and make accurate predictions based on that data. So, for example, if a Neural Network were given an image of a bird, it would be able to predict, ‘hey, that’s actually a bird.’

Ok, it’s all well and good to say that Neural Networks can do this, but more importantly, how can they do it? Well, before we can understand that, we need to understand a few key words that are instrumental to how these things work:

  • Input Layer: This is the ‘layer,’ or step, where the data is fed into the Neural Network.
  • Hidden Layer: These layers are where the process of interpreting the data happens. The data is filtered and becomes more abstract at each one.
  • Output Layer: This layer is the output, or interpretation. In the case of an image classifier looking at a bird, this is where the computer says “Yep, that’s a pigeon.”
  • Node: These are the things that make up the layers. Each node is a filter that takes in input (the data from the previous layer), is responsible for recognizing one specific pattern, and outputs how strongly it sees that pattern.
  • Activation Function: This is the node’s function. It takes all the inputs from the previous layer and squashes them into a single number between 0 and 1.
  • Weight: This is the output of each activation function, a number between 0 and 1 that represents how strongly the node found the pattern it was looking for. (Quick heads-up: most textbooks call this output an “activation” and use “weight” for the connection strengths between nodes; this article uses “weight” loosely for a node’s output.)
  • Parameter Value: The adjustable numbers inside the network that decide how much each input counts and how high the combined signal needs to be for a node to pass a strong signal on to the next hidden layer. These are the numbers the network learns.

*Note that the input layer has “input nodes” which represent the raw data, rather than the nodes in the hidden layers, which detect more abstract patterns
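
If seeing this as code helps, here’s a tiny sketch of the pieces in Python (with NumPy). Nothing here comes from a real library’s API; the sizes and numbers are made up purely to show what an input layer, parameter values, and an activation function look like when you write them down:

```python
import numpy as np

# Activation function: squashes any number into a value between 0 and 1.
# (This particular one is called a sigmoid; it's one common choice.)
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Input layer: the raw data, e.g. 4 pixel values between 0 and 1.
input_layer = np.array([0.0, 0.9, 0.8, 0.1])

# Parameter values: the adjustable numbers the network learns. Each hidden
# node gets one number per input (how much it cares about that input)
# plus a bias (how hard it is to make the node "fire").
hidden_weights = np.random.randn(3, 4)  # 3 hidden nodes, 4 inputs each
hidden_biases = np.random.randn(3)

# Hidden layer: each node combines its inputs and pushes the result
# through the activation function, producing a value between 0 and 1.
hidden_layer = sigmoid(hidden_weights @ input_layer + hidden_biases)

print(hidden_layer)  # three numbers between 0 and 1, one per hidden node
```

The three numbers that come out at the end are what this article calls the ‘weights’ that get passed along to the next layer.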

Now, let’s look at how all these things fit together.

Note that each circle represents a node, and each column of nodes represents a layer

Let’s look at this process step by step. First, we have the input layer. Each node, or ‘input node,’ represents a different little piece of the raw data that makes up your input. For example, the input nodes for an image of a bird would be its pixels. For our purposes, we’re going to assume that there is one input node for every single pixel in the image (in practice, the image is sometimes compressed first). Every pixel passes its value along, but a dark pixel carries a value near zero, so the first hidden layer effectively only ‘hears from’ the pixels that are actually lit up.

“What the heck even is this information,” I hear you ask? No, just my imagination? Well, I’m gonna explain it anyway! The information is the weight: a decimal number between 0 and 1. If we think of a black-and-white image, 0 would be black and 1 would be white, with the decimals in between being different shades of grey.
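
To make that concrete, here’s a little made-up example (Python/NumPy again, numbers invented for illustration) of a tiny grayscale image turning into the input layer:

```python
import numpy as np

# A made-up 4x4 grayscale image: 0 = black, 1 = white, in between = grey.
image = np.array([
    [0.0, 0.0, 0.9, 0.0],
    [0.0, 0.0, 0.9, 0.0],
    [0.0, 0.0, 0.9, 0.0],
    [0.0, 0.0, 0.9, 0.0],
])

# The input layer is just this image flattened into one long list of
# numbers: one input node per pixel.
input_layer = image.flatten()
print(input_layer.shape)  # (16,) -> 16 input nodes for a 4x4 image

# Real image files usually store pixels as whole numbers from 0 to 255,
# so in practice you'd divide by 255 to get values between 0 and 1.
```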

So, the first hidden layer receives the input in the form of a bunch of numbers between 0 and 1 (weights), each representing a separate pixel. Now, what does each node do with this information? Well, it feeds those numbers into its activation function and spits out the result to each node in the next hidden layer!

Huh?

It’s helpful to think of a node as a filter, or a pattern recognizer. It takes in data from the nodes of the previous layer and outputs a slightly more abstract pattern. Let’s say this specific node is trying to find vertical edges. It would take the weights of all the nodes in the previous layer and put them into its activation function. The activation function is basically asking, “Do we see any vertical edges in this data?” The answer is given in the form of a weight, a decimal number between 0 and 1. A 0 would mean “no, we don’t see a vertical edge,” a 1 would mean “yes, we do see a vertical edge!”, and the decimals in between represent how much the input resembles a vertical edge. Suddenly, the output of this node is not just a number; it represents an abstract concept.
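
Just to show that a ‘vertical edge detector’ isn’t magic either, here’s a toy node whose parameter values I’ve set by hand so it lights up on a bright vertical stripe in a 3×3 patch. A real network would learn these numbers on its own; this sketch is purely for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Hand-picked parameter values (not learned!) that respond to a bright
# vertical stripe down the middle of a 3x3 patch of pixels.
edge_weights = np.array([
    [-1.0,  2.0, -1.0],
    [-1.0,  2.0, -1.0],
    [-1.0,  2.0, -1.0],
]).flatten()
bias = -2.0

def vertical_edge_node(patch):
    return sigmoid(np.dot(patch.flatten(), edge_weights) + bias)

stripe = np.array([[0, 1, 0],
                   [0, 1, 0],
                   [0, 1, 0]], dtype=float)
flat_grey = np.full((3, 3), 0.5)

print(vertical_edge_node(stripe))     # close to 1: "yep, vertical edge"
print(vertical_edge_node(flat_grey))  # much lower: "probably not"
```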

I want to stress that most of the time, the activation function spits out a decimal, not 0 or 1. The higher the decimal, the more the computer thinks that the pattern is there. The parameter values determine how high the decimal needs to be in order for it to pass to the next layer.

Remember that a node is a filter. If it doesn’t pick up on the pattern it’s looking for, it won’t send a strong signal to the next layer. This is incredibly important, because the absence of information is information in and of itself for the next layer of nodes.

Let’s think about what the second hidden layer might look like when trying to identify a bird. These nodes are trying to pick up on more abstract patterns, maybe body parts. Let’s say one node is trying to detect a head. When it gets all the input from the previous layer, it runs it through its activation function, which in effect reasons, ‘ok, there were some vertical edges here, and some horizontal edges here, but there weren’t any here, so yup, I think that’s a head!’ It then outputs a weight between 0 and 1. Decimals closer to 1 mean the node is pretty confident that there is indeed a head, and decimals closer to 0 mean the node thinks the likelihood of there being a head is low.

With each layer, the patterns get more complex and abstract, until eventually you reach your output, and the computer says either ‘yes, this is a bird’ or ‘no, this isn’t a bird.’ In other cases of image classification, there might be more than two possible outcomes. Maybe there’s a whole range of possible animals, and the computer is trying to determine which class the image belongs to. Regardless, the process remains the same.
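
If you squint, the whole thing is just a loop over the layers. Here’s a sketch of a full forward pass with made-up layer sizes and random, untrained parameter values, so its guess means nothing yet; the point is the shape of the process:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(0)

# Made-up sizes: 16 pixels in, two hidden layers, 3 possible animals out.
layer_sizes = [16, 8, 8, 3]
weights = [rng.standard_normal((n_out, n_in))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [rng.standard_normal(n_out) for n_out in layer_sizes[1:]]

# The whole network is just this loop: each layer filters the previous
# layer's output and hands a more abstract version to the next one.
def forward(pixels):
    activations = pixels
    for w, b in zip(weights, biases):
        activations = sigmoid(w @ activations + b)
    return activations

classes = ["bird", "cat", "dog"]  # hypothetical classes
image = rng.random(16)            # a fake 4x4 image, flattened

scores = forward(image)
print(scores)                           # one value per class
print(classes[int(np.argmax(scores))])  # the network's best guess
                                        # (random here; nothing is trained)
```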

A visual of the data traveling through the hidden layers, and arriving at an output of classification

In a general sense, you can think of a neural network as a pathway for the raw data, or input, to travel through. At each layer the data is filtered, and what gets passed on is a slightly more abstract version of it, layer after layer, until you reach the final classification.

Of course, you can’t expect a neural network to do this perfectly from the start. You can’t even really expect the neural network to do this at all from the start. At first, the parameter values are set to completely random numbers, and the computer will spit out random answers at the output layer. But each time it’s wrong, it readjusts the parameter values, and slowly gets more and more accurate.
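
Here’s what that ‘guess, get it wrong, readjust’ loop looks like for a single node on a tiny made-up task. The readjustment rule below is gradient descent (for full networks, the machinery behind it is called backpropagation, which this article doesn’t cover):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(1)

# One node learning a tiny made-up task: output 1 whenever the second
# input is bright, 0 otherwise. The parameter values start out random.
weights = rng.standard_normal(2)
bias = rng.standard_normal()

inputs = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
targets = np.array([1.0, 0.0, 1.0, 0.0])

learning_rate = 1.0
for step in range(2000):
    for x, target in zip(inputs, targets):
        prediction = sigmoid(np.dot(weights, x) + bias)
        error = prediction - target
        # Nudge each parameter a tiny bit in the direction that
        # shrinks the error (this is gradient descent).
        nudge = error * prediction * (1 - prediction)
        weights -= learning_rate * nudge * x
        bias -= learning_rate * nudge

for x, target in zip(inputs, targets):
    guess = sigmoid(np.dot(weights, x) + bias)
    print(x, "->", round(float(guess), 2), "target:", target)
```

After enough passes, the node’s guesses land close to the targets, which is exactly the ‘slowly gets more and more accurate’ part.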

The funny thing is, although I talked about horizontal and vertical edges as examples of patterns that nodes might pick up on, you don’t actually know what patterns the computer saw. Hence the name “hidden layers”: what the computer actually sees and interprets is hidden.

Here’s a good video that explains this, but feel free to skip as it isn’t necessary

https://youtu.be/R9OHn5ZF4Uo

Wrapping your head around all of this can be pretty challenging. Don’t get discouraged if it doesn’t make complete sense. There is only one cure for that: time and effort! If there’s one thing I want you to understand about Neural Networks, it’s that they aren’t magic. It’s just math, and mathematical functions, applied very cleverly. Don’t be intimidated by all of it, and remember, you got this.

