Deep Learning — a gentle dive
I recently gave a 20-minute talk on Deep Learning at a developer group in Johannesburg. Since most of my slides were visual in nature, I decided to turn the talk into a blog post rather than sharing the slides.
Hopefully I can use this blog post to show you that, if you ignore all the hype, you can do some truly amazing things with Deep Learning.
To start, I’m going to try and tickle your fancy by showing you a few examples of what Deep Learning has achieved.
Lee Sedol beaten by AlphaGo at the ancient Chinese board game Go. For humanity it was painful to watch; for AI it was a huge leap forward.
Deep Learning used in self driving cars. It is estimated that Tesla vehicles have driven over 2 billion kilometres in Autopilot mode.
Deep Learning used for image annotation.
Real time translation — crossing the language barrier.
What is Deep Learning?
According to François Chollet:
“deep learning so far has been the ability to map space X to space Y using a continuous geometric transform, given large amounts of human-annotated data.”
In other words, using the diagram below as a guide, deep learning is the ability to map a class of samples (space X) to an output category (space Y) using a geometric transformation.
“Doing this well is a game-changer for essentially every industry, but it is still a very long way from human-level AI.”
Deep Learning has the ability to disrupt, and to disrupt fast. Even though it is nowhere near the level of human intelligence, there are tools and algorithms mature enough to add value in almost any industry.
Deep Learning as a Neural Network
We can also think of Deep Learning as a Neural Network with more than one hidden layer; such a network is said to be deep.
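To make that concrete, here is a minimal sketch in Keras, assuming a toy binary classification problem; the input dimension and layer sizes are arbitrary choices of mine:

```python
# A minimal sketch of a "deep" network: more than one hidden layer.
# The input dimension and layer sizes are arbitrary illustration choices.
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(64, activation='relu', input_dim=20))  # hidden layer 1
model.add(Dense(64, activation='relu'))                # hidden layer 2: now it is "deep"
model.add(Dense(1, activation='sigmoid'))              # output layer

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
```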
More intuitive explanation
Let’s imagine Deep Learning as a black box. We are standing on top of a mountain where our error is high, trying to get to the bottom where the error is lowest and the black box performs well. The process of going down the mountain is the learning part. The idea is to go down as quickly and smoothly as possible.
We do this by repetitively passing training samples through the black box and calculating the output error against the desired output. With every iteration we work our way back through the network, calculate the slope at each neuron, and adjust the weights in an optimised way to reduce the error (go down the mountain quickly and smoothly). If we keep doing this, we are learning.
The technique is called backpropagation, and it has had a big impact on the success of Deep Learning.
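To make the mountain analogy concrete, here is a toy sketch, with made-up numbers, of gradient descent on a single weight. Backpropagation is what computes these slopes efficiently for every weight in a multi-layer network:

```python
# Fitting y = w * x by repeatedly stepping down the error slope.
# All numbers are made up for illustration.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y_true = np.array([2.0, 4.0, 6.0, 8.0])          # desired outputs (w = 2 fits exactly)

w = 0.0                                          # start high up the mountain
learning_rate = 0.05

for step in range(100):
    y_pred = w * x
    error = np.mean((y_pred - y_true) ** 2)      # how high up the mountain we are
    slope = np.mean(2 * (y_pred - y_true) * x)   # the slope at our current position
    w -= learning_rate * slope                   # take a step downhill

print(w, error)  # w is close to 2.0, error is close to 0
```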
The era of MOOCs
One factor that has opened up Deep Learning to everyone is the era of MOOCs.
Through MOOCs you can learn the latest cutting-edge technology taught at universities, including Deep Learning.
Collectively we have been resisting evil corporations controlling software for a long time through the open source movement, and MOOCs are the result of a similar collective resistance to the control of education by universities.
P.S. I’m using the word evil in the nicest possible way.
The Stanford Experiment
An experiment run by Stanford University played a significant role in the launch of the MOOC era.
Stanford is picky: less than 5% of applicants to Stanford are successful; it is only for the elite.
In 2011 Stanford ran an experiment: they decided to teach three computer science courses online, open to all, for free. Video lectures were posted online, forums provided a discussion and collaboration platform, and tests were automated.
160,000 students enrolled, 75% of them from outside the US, across 190 countries. A hundred volunteers translated the class material into 44 languages.
That is 160,000 students getting a Stanford education with nothing more than an internet connection and a thirst for knowledge.
Where things get interesting is when you look at the class grading after the course. The first Stanford student in the class came in at number 161, beaten by 160 students mostly from poor, isolated developing countries. These were students who would never before have been able to attend a prestigious university like Stanford.
Largely thanks to the Stanford experiment, we now have quality MOOCs across a wide variety of science and tech fields, including Deep Learning.
The fast.ai MOOC by Jeremy Howard and Rachel Thomas is the leader of the pack: their different approach means you will have a Deep Learning model up and running in no time.
The contributions made by top experts
A second factor in the democratisation of Deep Learning is the contributions the top experts are making.
Fei-Fei Li is the creator of ImageNet and the ImageNet competition, which have contributed to making Deep Learning popular.
She is active in the community, and today she is the Chief Scientist of AI/ML at Google Cloud.
Here Andrej Karpathy is seen giving a lecture on Convolutional Neural Networks in 2016. He is a regular blogger and often shares insights on Twitter.
He has recently been appointed as the Director of Autopilot Vision at Tesla.
Yann LeCun using the Quora platform for a Deep Learning discussion. Quora is a great place to ask and answer technical Deep Learning questions; however, these discussions often end up being philosophical.
Yann LeCun is the Director of AI Research at Facebook and Professor at NYU.
Andrew Ng giving a Coursera ML lecture. He is Chief Scientist at Baidu while still being involved with Stanford and Coursera.
Story from Africa
We have access to quality MOOCs and to leading experts who share their work and ideas in public forums. If you add to that the wealth of freely available open source Deep Learning libraries, would you believe that we now have the ingredients for some incredible things to happen outside of Silicon Valley, e.g. in South Africa?
The answer is yes. A few emails across the world got me in touch with Isazi Consulting, a small Sandton company doing some amazing things with Deep Learning.
Dario (on the right of the picture) from Isazi Consulting explains how it all began as an annual week-long ‘hackathon’: taking difficult problems from across a range of industries and solving them. Now they are more than 20 people working full time, yet the approach has remained the same: taking difficult problems and solving them, often using Deep Learning. To name a few of the projects they are involved in: they have Deep Learning models finding anomalies in the field of radiology, they work with banks to detect fraudulent patterns in millions of transactions, and they are training models to do language translation for African languages.
Democratisation of AI
The role that the democratisation of AI has played in Isazi’s success is undeniable. Through it, Isazi have been able to collaborate with universities and experts across the world, use open source tools, and apply a continuous stream of new algorithms, each improving on the last. A recipe for success.
Africa has a golden opportunity to be a leader in Deep Learning: we have the diversity to challenge the norm, the passion to drive it, and the tools needed to make it happen. The ball is now in our court.
Let’s build something
I have chosen the Dogs vs. Cats dataset from the popular Kaggle competition for this small demo.
For traditional ML this is a difficult problem: there is so much overlap between cats and dogs in colour, texture and shape. For Deep Learning it is a fairly simple problem; cutting-edge models obtain around 97% accuracy, and my toy model, built just to show you some code, obtains 90% accuracy.
To solve the Dogs vs. Cats problem we use a special kind of Deep Learning architecture, namely Convolutional Neural Networks.
The main idea behind Convolutional Neural Networks is to slide a kernel across an image, extracting the most important features from the sample image. By stacking a number of these layers on top of each other you extract increasingly higher-level features from your sample images. Feeding the top layer into fully connected layers then makes it easy to learn the features that make the different categories unique.
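As a toy sketch of the sliding-kernel idea (the image and kernel values below are my own, purely for illustration):

```python
# Slide a 3x3 kernel across a grayscale "image" to produce a feature map.
import numpy as np

image = np.random.rand(8, 8)                  # a tiny random 8x8 image
kernel = np.array([[-1, -1, -1],
                   [-1,  8, -1],
                   [-1, -1, -1]])             # responds strongly to edges and blobs

out_h = image.shape[0] - kernel.shape[0] + 1
out_w = image.shape[1] - kernel.shape[1] + 1
feature_map = np.zeros((out_h, out_w))

for i in range(out_h):
    for j in range(out_w):
        patch = image[i:i + 3, j:j + 3]               # the window under the kernel
        feature_map[i, j] = np.sum(patch * kernel)    # one extracted feature value

print(feature_map.shape)  # (6, 6)
```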
The code below follows the excellent Keras tutorial by François Chollet.
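Here is a minimal sketch in the spirit of that tutorial: a small stack of convolution and max-pooling layers feeding into fully connected layers. Treat the exact layer sizes and the 150x150 input as my assumptions rather than the tutorial verbatim.

```python
# A small Convolutional Neural Network for Dogs vs. Cats, in the spirit of
# Chollet's Keras tutorial. Layer sizes and input shape are assumptions.
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())                       # feed the top layer into...
model.add(Dense(64, activation='relu'))    # ...fully connected layers
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))  # dog or cat

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
```

Training then amounts to pointing a Keras ImageDataGenerator at your train and validation folders and fitting the model; Chollet’s tutorial also adds data augmentation to get more out of a small dataset.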
The true skill lies in the architecture of the model and in optimising the hyperparameters for the problem at hand, rather than in thousands and thousands of lines of complex code.