Udacity’s Deep Learning Foundations: What I learned in Project One

You could say that Artificial Intelligence was my first love as a child. It ran through all of my favourite stories, from Asimov’s Robot series to the epic scale of Frank Herbert’s Dune. If you are familiar with those works, you will know each author had an opposing vision of Artificial Intelligence and the ultimate impact it would have on human civilization. But I was a kid and the idea of building a robot was just awesome. When I started learning to code, first in C and then in Java, the goal in the back of my mind was always the ultimate robot that lived in my childhood imagination.

Needless to say, when I first saw the announcement that Udacity was going to offer a “Deep Learning Foundations” Nanodegree I had a “you had me at hello” moment and immediately registered.

I am proud to say that I have successfully completed Project One of this Nanodegree (entitled “Your First Neural Network”). There were moments when I thought I wouldn’t. But passion is a strong driver.

Having been through the experience, and having thought for some time I should start a blog, I thought sharing what I have learned through the first project would be a good place to start.

To quickly summarize, the goal of this first project was to create a neural network to output predictions of bike sharing usage. Udacity provided the dataset as well as some starter code; the student’s job was to code a neural network that would output predictions based on the provided data.

Neural networks are fundamental to deep learning and to solving problems using deep learning. In essence, a neural network has three essential components: the input layer, the hidden layer and the output layer.

The picture above is an example of a neural network. The input, or the data provided, is the start. You feed data into the neural network through the input layer. From here, the data works its way through the hidden layer (in the picture above, there are two hidden layers — but there could be a single layer or many). The hidden layer will pass its results to the output layer where the result is displayed to the end user.
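To make the flow from input to output a little more concrete, here is a minimal sketch of a single pass through a tiny network with one hidden layer. It is not the project's starter code: the function names, the sigmoid activation on the hidden layer and all of the numbers are my own assumptions, chosen only for illustration.

```python
import math


def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, w_hidden, w_out):
    """Push one set of inputs through the network and return the result.

    w_hidden holds one list of weights per hidden unit (one weight per input).
    w_out holds one weight per hidden unit.
    """
    # Each hidden unit takes a weighted sum of the inputs, then applies sigmoid.
    hidden = [
        sigmoid(sum(x * w for x, w in zip(inputs, unit_weights)))
        for unit_weights in w_hidden
    ]
    # The output is a plain weighted sum of the hidden values.
    output = sum(h * w for h, w in zip(hidden, w_out))
    return hidden, output


# Made-up values: three inputs, two hidden units, one output.
inputs = [0.5, -0.2, 0.1]
w_hidden = [[0.1, 0.4, -0.2], [-0.3, 0.2, 0.5]]
w_out = [0.3, -0.1]

hidden, output = forward(inputs, w_hidden, w_out)
print("hidden layer:", hidden)
print("output:", output)
```

With untrained, made-up weights like these the output is meaningless; the point is only to show the path the data takes.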

The hidden layer is where the magic happens. As information from the inputs passes through the hidden layer(s), each connection applies a weight that determines how much influence that value has on the final output. Without getting too technical, these weights are adjusted to reduce the error using a mathematical technique called Gradient Descent. It’s not too complex, but it’s not really what I want to focus on here. Where it gets interesting is that the error measured at the output (the gap between the prediction and the known answer) is passed back through the hidden layer, a step known as backpropagation, so that each weight can be nudged in the direction that reduces the error and gives more accurate results.
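For the curious, the idea behind Gradient Descent fits in a few lines if we shrink the problem down to a single weight. This is only a sketch under my own assumptions (one input, one weight, a squared-error measure, and a hypothetical `gradient_step` helper), not the project code.

```python
def gradient_step(w, x, target, learning_rate=0.1):
    """Adjust one weight slightly so the prediction moves toward the target."""
    prediction = w * x
    error = target - prediction
    # For the squared error 0.5 * error**2, the slope with respect to w is
    # -error * x, so stepping "downhill" means adding learning_rate * error * x.
    return w + learning_rate * error * x


w = 0.0                 # start with no opinion about the input
x, target = 2.0, 1.0    # made-up training example

errors = []
for step in range(5):
    w = gradient_step(w, x, target)
    errors.append(target - w * x)
    print(f"step {step}: weight={w:.4f}, remaining error={errors[-1]:.4f}")
```

Each step leaves a smaller error than the one before, which is all "descent" really means here.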

This is where the machine “learning” comes into effect: nobody tells the program how important each input is. Given the weights, the error and Gradient Descent, it works out on its own how much weight each input should carry in calculating the final output. The system repeats this process many, many times, learning from each run (by passing the error from the output back through the hidden layer) so that the error shrinks, ideally towards zero, and the results become more accurate.
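Putting the pieces together, the repeated "predict, measure the error, pass it back, adjust" cycle might look roughly like the sketch below. Everything in it is an assumption on my part: the toy dataset (not the bike sharing data), the three hidden units, the learning rate, the number of passes and the function names.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(x, w_hidden, w_out):
    """Forward pass: inputs -> hidden layer (sigmoid) -> single output."""
    hidden = [sigmoid(sum(xi * w for xi, w in zip(x, unit))) for unit in w_hidden]
    return hidden, sum(h * w for h, w in zip(hidden, w_out))


def mean_squared_error(data, w_hidden, w_out):
    return sum((t - predict(x, w_hidden, w_out)[1]) ** 2 for x, t in data) / len(data)


def train(data, n_hidden=3, learning_rate=0.1, epochs=500, seed=0):
    """Run many passes over the data, nudging the weights after every example."""
    rng = random.Random(seed)
    n_inputs = len(data[0][0])
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]

    for _ in range(epochs):
        for x, target in data:
            hidden, output = predict(x, w_hidden, w_out)
            error = target - output
            # Pass the error back: each hidden unit gets a share proportional
            # to its output weight, scaled by the slope of the sigmoid.
            hidden_deltas = [error * w * h * (1 - h) for w, h in zip(w_out, hidden)]
            # Nudge every weight in the direction that reduces the error.
            w_out = [w + learning_rate * error * h for w, h in zip(w_out, hidden)]
            w_hidden = [[w + learning_rate * d * xi for w, xi in zip(unit, x)]
                        for unit, d in zip(w_hidden, hidden_deltas)]
    return w_hidden, w_out


# Toy data, invented for this sketch: the "right answer" is 0.5*a - 0.3*b.
data = [([a, b], 0.5 * a - 0.3 * b)
        for a in (-1.0, 0.0, 1.0) for b in (-1.0, 0.0, 1.0)]

loss_before = mean_squared_error(data, *train(data, epochs=0))
loss_after = mean_squared_error(data, *train(data))
print("error before training:", loss_before)
print("error after training: ", loss_after)
```

Calling `train` with `epochs=0` simply returns the random starting weights, which gives a baseline to compare the trained network against.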

The power of the hidden layer does rely on the quality of the input. The final output is really a reflection of the quality of the data fed into the network. The network measures its error against the data it is given and adjusts the weights accordingly; if the data is flawed from the start, the output will be too.
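A small experiment illustrates the point. In the sketch below, the same single-weight learner is trained twice: once on clean examples and once on examples whose answers have been deliberately corrupted. The data, the `fit_weight` helper and the settings are all invented for illustration.

```python
def fit_weight(samples, learning_rate=0.05, epochs=200):
    """Learn one weight w so that w * x matches the targets as well as possible."""
    w = 0.0
    for _ in range(epochs):
        for x, target in samples:
            w += learning_rate * (target - w * x) * x
    return w


# Clean data: the true relationship is target = 2 * x.
clean = [(x, 2.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]
# Flawed data: same inputs, but every answer has the wrong sign.
flawed = [(x, -target) for x, target in clean]

w_clean = fit_weight(clean)
w_flawed = fit_weight(flawed)
print("learned from clean data: ", w_clean)
print("learned from flawed data:", w_flawed)
```

The learner does its job equally well both times; it just faithfully learns whatever relationship the data contains, right or wrong.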

A great example of this is the recent failed chatbot experiment from Microsoft. Users, unfortunately, chose to feed it words and phrases that carried negative connotations. It was no surprise, then, that the chatbot’s behaviour reflected this and that it had to be shut down. Much like a child, a neural network can only grow and learn based on what it is taught, or rather, the input that it is given.

If Artificial Intelligence and Machine Learning are going to have a positive impact on society, the integrity of the data we feed into neural networks is going to be the single most important factor.

This may be a simplistic assessment, but it was also a simple neural network and only the first project. I find this subject fascinating, and I look forward to continuing to learn and, perhaps, sharing what I learn.