Basic Concepts You Should Know Before Starting with the “Neural Networks” (NN) — 3

shaistha fathima
Sep 11 · 9 min read

Hey guys! I hope you have read the previous two posts of this series (Basic Concepts You Should Know Before Starting with the “Neural Networks” (NN)); if not, please have a look at them before going ahead.

So… all this time I have been saying “Basic Concepts You Should Know Before Starting with the “Neural Networks” (NN)”, but what does “neural network” actually mean, and how is one made?

Before we move any further, I would like you to have a basic understanding of what a neural network is: nothing too deep, just the intro…

Not too deep INTRO to Neural Network!

For this I will be answering 4 basic questions for you:

  • What is a neural network?
  • Where can we use it?
  • How is it made?
  • Why should we bother using it?

What is a neural network?

As defined in Investopedia :

A neural network is a series of algorithms that endeavors to recognize underlying relationships in a set of data through a process that mimics the way the human brain operates. Neural networks can adapt to changing input; so the network generates the best possible result without needing to redesign the output criteria.

To put it simply, Neural networks are a series of algorithms that mimic the operations of a human brain to recognize relationships between vast amounts of data.

A “neuron” is the fundamental unit of a neural network, i.e., a combination of neurons forms a neural network. Here, we refer to these neurons as “perceptrons”.

A perceptron with a basic example: detecting whether a student has passed or failed after processing their marks (many inputs, one output!)

Where can we use it?

Neural networks are used in deep learning to model complex patterns and solve prediction problems. The simplest examples are handwritten letter recognition, object detection, etc.

Neural networks are broadly used, with applications for financial operations, enterprise planning, trading, business analytics and product maintenance. Neural networks have also gained widespread adoption in business applications such as forecasting and marketing research solutions, fraud detection and risk assessment.

How is it made?

Neural networks are composed of layers of computational units called neurons (perceptrons), with connections between the layers. These networks transform data until they can classify it as an output.

This example uses 2 layers to process the data and get the desired output.

Each ball in the hidden layer represents a perceptron!

Each neuron multiplies an initial value by some weight, sums results with other values coming into the same neuron, adjusts the resulting number by the neuron’s bias, and then normalizes the output with an activation function.
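The computation just described can be sketched in a few lines of Python. This is a minimal illustration; the specific weights, bias, and choice of a sigmoid as the normalizing activation are my own assumptions, not from this post:

```python
import math

def neuron(inputs, weights, bias):
    # Multiply each input by its weight, sum the results, add the bias...
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    # ...then normalize the output with an activation function (sigmoid here)
    return 1 / (1 + math.exp(-total))

# Three inputs feeding one neuron
print(neuron([1.0, 0.5, -0.5], [0.8, -0.2, 0.4], bias=0.1))  # roughly 0.6457
```

The sigmoid squashes the raw weighted sum into the range (0, 1), which is one common way to "normalize" a neuron's output.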

For now, you may just remember that neural networks are a means of doing machine learning, in which a computer learns to perform some task by analyzing training examples. Usually, the examples have been labeled in advance. An object recognition system, for instance, might be fed thousands of labeled images of cars, houses, coffee cups, and so on, and it would find visual patterns in the images that consistently correlate with particular labels.

So a key feature is an iterative learning process in which records (rows) are presented to the network one at a time, and the weights associated with the input values are adjusted each time. After all the cases are presented, the process is often repeated. During this learning phase, the network trains by adjusting the weights to predict the correct class label of the input samples.
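That iterative weight-adjustment loop can be sketched with the classic perceptron learning rule. This is a toy illustration; the function name and the AND-gate training data are my own, not from this post:

```python
def train_perceptron(rows, labels, lr=0.1, epochs=10):
    """Present each record one at a time, nudging the weights
    whenever the prediction is wrong; repeat over all cases."""
    w, b = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):                      # repeat the whole pass
        for x, y in zip(rows, labels):           # one record at a time
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = y - pred                       # -1, 0, or +1
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

# Learn the AND function from four labeled rows
w, b = train_perceptron([(0, 0), (0, 1), (1, 0), (1, 1)], [0, 0, 0, 1])
preds = [1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
         for x in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(preds)  # [0, 0, 0, 1]
```

After a handful of passes over the four cases, the weights settle on values that predict the correct class label for every input sample.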

Why should we bother using it?

With the computational power needed by companies almost doubling every two years, many companies now use powerful hardware to perform tasks on large datasets in very little time using deep learning! Nvidia is the market leader when it comes to GPUs, and Google has now come into this space as well, releasing TPUs (specialized ASICs for deep learning) in recent times, making it easier to train deep neural networks, which require so much computational power.

We use a neural network especially when we have a lot of data to process, requiring high computational power, and when accuracy matters the most to you. For example, cancer detection: you cannot mess around with accuracy here if you want this to be used in actual medical applications.

The main advantage of neural networks lies in their ability to outperform nearly every traditional machine learning algorithm, but just like everything else, they too have their disadvantages. Read this post by Rahul Bhatia on when not to use neural networks.

The only thing to keep in mind while using a neural network is that, generally, the more data you give it to process, the better the accuracy and results will be!

Perceptron

Like I have said before, perceptrons are what we call the neurons of a neural network, meaning they are the building blocks of neural networks! A perceptron is sometimes also referred to as the simplest form of a neural network, or a single-layer neural network.

A perceptron takes several binary inputs, x1,x2,x3,…, and produces a single binary output:

In this example, the perceptron has three inputs, x1,x2,x3. In general it could have more or fewer inputs.

The image below shows the basic structure of a perceptron:

A perceptron is a processing unit that takes in input values and gives an output after processing them. It helps in forming the decision boundary and finding the line (from the example in post 2 on the number of students who pass, based on gradient descent)! And yes, the perceptron also uses the concept of gradient descent!

In this context, the perceptron follows these steps:

  1. Multiply all the inputs by their weights w, real numbers that express how important the corresponding inputs are to the output.
  2. Add them together, referred to as the weighted sum: ∑ wj xj (w·x in the equation below).
  3. Apply the activation function; in other words, determine whether the weighted sum is greater than a threshold value (where −threshold is equivalent to the bias), and assign 1 or 0 as the output.

We can also write the perceptron function in the following terms:

output = 0 if w·x + b ≤ 0
output = 1 if w·x + b > 0

Notes: b is the bias and is equivalent to −threshold; w·x is the dot product of w, a vector whose components are the weights, and x, a vector consisting of the inputs.
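Putting the three steps and the bias form together, the perceptron function can be written directly in Python. A minimal sketch; the sample weights and bias are arbitrary numbers I chose for illustration:

```python
def perceptron(x, w, b):
    # Output 1 if the dot product w·x plus the bias b exceeds 0, else 0
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

print(perceptron([1, 0, 1], [0.5, 0.5, 0.5], b=-0.8))  # 1: 0.5 + 0.5 - 0.8 > 0
print(perceptron([1, 0, 0], [0.5, 0.5, 0.5], b=-0.8))  # 0: 0.5 - 0.8 <= 0
```

Note how b = −0.8 plays the role of −threshold: the weighted sum has to clear 0.8 before the perceptron fires.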

From the above equation and process, you might have a few questions; let's answer each of them one by one…

Why should you find weighted sum?

As said above, one perceptron is used to find one definite result; though it may take in many input values, in the end it just processes those values and finds the relevant output. To help get a result, here a Boolean value of 1 or 0, the summation of the dot products of weights and inputs was introduced. You will understand it better with the later example.

What is an activation function and why must we use it?

The activation function decides whether a neuron should be activated or not by calculating the weighted sum and further adding the bias to it. The purpose of the activation function is to introduce non-linearity into the output of a neuron.

In a neural network, we update the weights and biases of the neurons on the basis of the error at the output. This process is known as back-propagation. Activation functions make back-propagation possible, since the gradients are supplied along with the error to update the weights and biases.

Why do we need non-linear activation functions?
A neural network without an activation function is essentially just a linear regression model. The activation function performs the non-linear transformation of the input, making the network capable of learning and performing more complex tasks.
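A quick way to see why is that stacking linear layers without activations just collapses into one linear map, so the extra layer buys nothing. A toy numeric check with scalar "layers" (the numbers are my own, not from this post):

```python
# y = w2 * (w1 * x + b1) + b2  ==  (w2*w1) * x + (w2*b1 + b2)
def two_linear_layers(x, w1, b1, w2, b2):
    hidden = w1 * x + b1          # layer 1, no activation
    return w2 * hidden + b2       # layer 2, no activation

def one_linear_layer(x, w, b):
    return w * x + b

# For any input, the two-layer network agrees with a single linear layer
w1, b1, w2, b2 = 2.0, 1.0, 3.0, -0.5
for x in [-1.0, 0.0, 2.5]:
    assert two_linear_layers(x, w1, b1, w2, b2) == one_linear_layer(x, w2 * w1, w2 * b1 + b2)
```

Inserting a non-linearity between the layers (a sigmoid, for example) breaks this equivalence, which is exactly what lets deeper networks model more complex functions.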

There are many types of activation functions, which we will discuss in the next post, but if you are interested, you might check these links:

Example to understand the perceptron function:

One of the examples quoted from Neural Networks and Deep Learning

Let me give an example. It’s not a very realistic example, but it’s easy to understand, and we’ll soon get to more realistic examples. Suppose the weekend is coming up, and you’ve heard that there’s going to be a cheese festival in your city. You like cheese, and are trying to decide whether or not to go to the festival. You might make your decision by weighing up three factors:

Is the weather good?

Does your boyfriend or girlfriend want to accompany you?

Is the festival near public transit? (You don’t own a car).

We can represent these three factors by corresponding binary variables x1,x2, and x3. For instance, we’d have x1=1 if the weather is good, and x1=0 if the weather is bad. Similarly, x2=1 if your boyfriend or girlfriend wants to go, and x2=0 if not. And similarly again for x3 and public transit.

Now, suppose you absolutely adore cheese, so much so that you’re happy to go to the festival even if your boyfriend or girlfriend is uninterested and the festival is hard to get to. But perhaps you really loathe bad weather, and there’s no way you’d go to the festival if the weather is bad. You can use perceptrons to model this kind of decision-making.

One way to do this is to choose a weight w1=6 for the weather, and w2=2 and w3=2 for the other conditions. The larger value of w1 indicates that the weather matters a lot to you, much more than whether your boyfriend or girlfriend joins you, or the nearness of public transit. Finally, suppose you choose a threshold of 5 for the perceptron. With these choices, the perceptron implements the desired decision-making model, outputting 1 whenever the weather is good, and 0 whenever the weather is bad. It makes no difference to the output whether your boyfriend or girlfriend wants to go, or whether public transit is nearby.

By varying the weights and the threshold, we can get different models of decision-making. For example, suppose we instead chose a threshold of 3. Then the perceptron would decide that you should go to the festival whenever the weather was good or when both the festival was near public transit and your boyfriend or girlfriend was willing to join you. In other words, it’d be a different model of decision-making. Dropping the threshold means you’re more willing to go to the festival.
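The festival example above translates directly into code. A small sketch (the function name is mine) that checks both thresholds from the text:

```python
def go_to_festival(weather, partner, transit, threshold=5):
    # Weights 6, 2, 2: the weather dominates the other two factors
    weighted_sum = 6 * weather + 2 * partner + 2 * transit
    return 1 if weighted_sum > threshold else 0

# Threshold 5: only the weather matters
print(go_to_festival(1, 0, 0))  # 1 (good weather alone: 6 > 5)
print(go_to_festival(0, 1, 1))  # 0 (partner + transit: 4, not enough)

# Threshold 3: partner + transit together now suffice
print(go_to_festival(0, 1, 1, threshold=3))  # 1 (4 > 3)
```

Changing only the threshold changes the decision-making model, exactly as described: dropping it from 5 to 3 makes the perceptron more willing to output 1.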

Obviously, the perceptron isn’t a complete model of human decision-making! But what the example illustrates is how a perceptron can weigh up different kinds of evidence in order to make decisions. And it should seem plausible that a complex network of perceptrons could make quite subtle decisions:

In this network, the first column of perceptrons — what we’ll call the first layer of perceptrons — is making three very simple decisions, by weighing the input evidence. What about the perceptrons in the second layer? Each of those perceptrons is making a decision by weighing up the results from the first layer of decision-making. In this way a perceptron in the second layer can make a decision at a more complex and more abstract level than perceptrons in the first layer. And even more complex decisions can be made by the perceptron in the third layer. In this way, a many-layer network of perceptrons can engage in sophisticated decision making.

Conclusion

I hope the above example helped you understand the importance and use of weights and bias. In the next post, we will dig deeper into perceptrons and try to understand them better! You are always free to ping me here or reach out on Twitter with any queries.

Resources you might want to refer to:

You may also check my other series on Introduction to “Tensors” if interested!

Till then stay tuned and happy learning!

