Deep learning for Python developers (and the average Joe who is just curious about the stuff)

Edward Leoni
Learn stuff with Ed
18 min read · Apr 15, 2020

This article used to be a mini ebook: a few hundred people got it for free and a couple dozen bought it at a very cheap price. But I figured Medium would be the most efficient way to reach people, so I have since removed the book from Amazon and Kindle, updated and simplified it, and what you are about to read is the result.

Introduction

Written by a developer, for developers, this article is aimed at professionals with prior programming experience who want to get up to speed with the latest deep learning trends and understand basic artificial intelligence concepts, jargon, and technology. Expect to get your hands dirty!

Throughout this article, I'll assume you know nothing about artificial intelligence and will hold your hand through the entire process. Knowledge will be given to you when you need it, never before, preventing the information overload that I have identified as the most common problem people face when trying to learn about data science, artificial intelligence, or machine learning.

By the end of this article, you will have used Python to train a model, make predictions, and leverage supervised learning techniques.

How to use this

I recommend reading somewhere with access to a computer so that you can practice the examples. And do practice the examples: it's the best way to make the knowledge stick.

This article comes with a glossary containing key definitions that are fundamental to understanding what follows. I recommend that you quickly scan the glossary, but don't try to memorise its content; just come back to it every time you find a term you are not familiar with!

If you find a term in italics during your reading, that means you can find its definition in the glossary.

The artificial intelligence field is full of jargon and nuances that can seem foreign and unfamiliar. If you find yourself overwhelmed by new terms and concepts during the reading, power through them; a practical example should come right after, which will hopefully make things clear.

Talk the talk — Glossary

Activation Function — The activation function of a node defines the output of that node given an input or set of inputs. Common activation functions include ReLU, softmax, and sigmoid.

Dataset — A collection of samples.

Feature — A variable that defines a characteristic of something. E.g.: alcohol level is a feature of a beverage.

Flask — Flask is a micro web framework written in Python.

Hidden Layer — A hidden layer is a layer whose output is connected to the inputs of other layers and is therefore not visible as a network output. Hidden layers are fed with data from the input layer or from another hidden layer and feed their results to another hidden layer or to the output layer.

Initialisation — Initializations define the way to set the initial random weights of Keras layers.

Input Layer / Entry Layer — The input layer is the first layer or the entry layer of a Neural Network. It’s used to feed information into the network for processing.

Keras — A high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano.

Metric — A function that is used to judge the performance of your model.

Model — The artifact produced by the training process.

MongoDB — MongoDB is a free and open-source cross-platform document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with schemas.

Node — Akin to a neuron in the brain's vast network of neurons. Nodes are what layers are composed of.

NumPy — A Python library that adds support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.

Optimizer — Optimisation algorithm for a deep learning model.

Output Layer — The output layer is the last layer of a Neural Network. This layer is fed with the results of previous layers and is responsible for coming up with a solution.

Sample — A set of features describing a specific thing. E.g.: body mass, age, and diastolic blood pressure form a sample that describes a diabetes patient.

TensorFlow — TensorFlow is an open-source software library for dataflow programming across a range of tasks. It is a symbolic math library, and also used for machine learning applications such as neural networks.

What is Machine Learning, actually?

Machine Learning is a field of statistics and computer science that gives computer systems the ability to “learn”, or better yet, to progressively improve performance on a specific task, leveraging data, without being explicitly programmed.

The term was first used in 1959 (yep, that long ago!), evolving from the study of pattern recognition and computational learning theory in artificial intelligence. Machine learning explores the study and construction of algorithms that can learn from and make predictions on data; rather than following strictly static program instructions, such algorithms make data-driven predictions or decisions.

Nowadays, the most common form of Machine Learning is through the use of Neural Networks.

What are Neural Networks?

I really like Maureen Caudill's definition of Neural Networks: "a computing system made up of a number of simple, highly interconnected processing elements, which process information by their dynamic state response to external inputs". In short, much like anything else computer-related: something is input, that something gets processed, and something else gets output.

The catch is that Neural Networks try to mimic how the brain works, by processing information through interconnected processing elements called nodes grouped by layers.

These nodes contain an activation function. Patterns are presented to the network via the input layer, which communicates with one or more hidden layers where the actual processing is done via a system of weighted connections. The hidden layers then link to an output layer where the answer to the problem it’s trying to solve is output.

What is Deep Learning?

Deep learning is part of a broader family of machine learning methods based on learning data representations, as opposed to task-specific algorithms.

Learning can be supervised, semi-supervised or unsupervised. Deep learning models are loosely related to information processing and communication patterns in a biological nervous system. Deep learning architectures such as deep neural networks, deep belief networks and recurrent neural networks have been applied to fields including computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation, bioinformatics and drug design, where they have produced results comparable to and in some cases superior to human experts.

Enough theory, let’s get our hands dirty

From now on I'm assuming you have access to a computer with Python 3 installed and that you are able to run scripts. If not, go sort that out and come back! If you don't know where to start, have a look at https://mlis.fun/start-here/get-started-with-python/.

You will need to install a few dependencies right off the bat, such as TensorFlow, h5py, NumPy and Keras. Get that sorted too before we proceed.

Thirteen lines of code! That's all it's going to take to train your first model using Keras. The neural network will look, more or less, like this: an entry layer taking 8 features, two hidden layers, and an output layer with a single node.

Just a reminder, make sure you check the meaning of these layers in the glossary in case you are in doubt.

We will use a dataset that contains information about patients tested for diabetes. Each sample contains the following variables, in this order:

  • Number of times pregnant
  • Plasma glucose concentration at 2 hours in an oral glucose tolerance test
  • Diastolic blood pressure (mm Hg)
  • Triceps skinfold thickness (mm)
  • 2-Hour serum insulin (mu U/ml)
  • Body mass index (weight in kg/(height in m)²)
  • Diabetes pedigree function
  • Age (years)

In short, what's going to happen is this: we will feed the features above into our neural network through the input layer; they will be processed and sent to the first hidden layer, then to the second, and finally to the output layer, which will generate a result.

If this is getting confusing, don't worry, read it again. If it's still a bit blurry, power through; it will make more sense later.

I retrieved this data from a public machine learning dataset, which you can download here.

The dataset is a CSV with 9 columns: 8 with the variables listed above and a ninth containing a boolean indicating whether the person has diabetes or not.

Let’s get coding:
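Here's the script, more or less (the line references below assume this exact layout; I'm using binary cross-entropy as the loss, a standard choice for a yes/no output):

from keras.models import Sequential
from keras.layers import Dense
import numpy

# load the dataset
dataset = numpy.loadtxt("diabetes.csv", delimiter=",")

# split the input features (X) from the output (Y)
X = dataset[:, 0:8]
Y = dataset[:, 8]

# define the network
model = Sequential()
model.add(Dense(12, input_dim=8, kernel_initializer="uniform", activation="relu"))
model.add(Dense(8, kernel_initializer="uniform", activation="relu"))
model.add(Dense(1, kernel_initializer="uniform", activation="sigmoid"))

# compile the model
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

# train: 600 passes over the data, 10 samples at a time
model.fit(X, Y, epochs=600, batch_size=10)

model.save("diabetes.h5")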

The code above expects you to have downloaded the dataset, placed it in the same folder as the script, and named it "diabetes.csv", as seen on line 6.

The first thing we will do after loading dependencies and the dataset, is to split the input from the output. See lines 9 and 10.

From line 13 to 16, we’re making sure the neural network looks as proposed.

Line 14 ensures there will be 12 nodes in the input layer, which will take 8 features. We're then telling the initialiser to use a "uniform" initialization. We chose ReLU as the activation function. Activation functions almost deserve a book of their own, so let's skip past that for now or things will get too complicated.

Line 15 creates a hidden layer. The results of the input layer will be fed into the hidden layer which will feed the output layer defined on line 16.

For the output layer, we only want to get one result, so there will be only one node. We also changed the activation function to sigmoid.

It's time to compile the model so it's ready for training; that's exactly what we do on line 19. The optimizer of choice is "Adam". Adam is all about performance, which is why we're choosing it for now. Finally, we're setting the metric to "accuracy", as that's what we want to get out of it.

On line 22 we start fitting our model; yes, we are finally training it. By setting the epochs to 600, we're telling Keras to pass the entire dataset through the network 600 times; the batch size of 10 means we will send 10 samples at a time. These values are somewhat arbitrary; you can play with them and try to get the best results. There are some heuristics around choosing them, but let's not get into that now.

On line 24, we're saving it to a file that we can use later.

Right! It's time to finally test the accuracy of this thing. Thankfully, Keras provides built-in ways to evaluate a model and find out its accuracy.
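Here's the evaluation script, more or less (again, the line references below assume this layout):

from keras.models import load_model
import numpy

dataset = numpy.loadtxt("diabetes.csv", delimiter=",")
X = dataset[:, 0:8]
Y = dataset[:, 8]

# load the model we trained with the previous script
model = load_model("diabetes.h5")

scores = model.evaluate(X, Y)
print("%s: %.2f%%" % (model.metrics_names[1], scores[1] * 100))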

Again, very simple code: 8 lines of code if you don't count empty lines.

Up to line 8 there's nothing new; on line 9 we load the model we trained with the previous script.

On line 11 we use the built-in evaluate function, and on line 12 we simply print the result. If you didn't make modifications to the code, you should get an accuracy of just over 80%. Pretty decent, given the small dataset, huh?

Now it's time to get to the fun part: making predictions!

I will just hand you the code:
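Here it is, more or less (the sample on line 7 is just one example row; swap in any patient's features):

from keras.models import load_model
import numpy

model = load_model("diabetes.h5")

# the features of one patient, in the same order as the dataset columns
sample = numpy.array([[6, 148, 72, 35, 0, 33.6, 0.627, 50]])

prediction = model.predict(sample)

# below 0.5 means no diabetes, 0.5 or above means diabetes
print(numpy.round(prediction))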

Nothing new all the way to line 6. On line 7 we're loading the features of a diabetes patient. On line 9 we're telling Keras to use our model to predict whether the sample we've given has diabetes or not. Keras will return its confidence level; we're just going to round it: if the confidence is below 0.5, we will say the person doesn't have diabetes; if it's above, we will say the person does.

On line 12 we just print the results. That simple.

Example Project

The example project is meant to recap and reinforce what we've covered, and to introduce some more advanced concepts. We will build an end-to-end deep learning application, powered by a neural network, that leverages supervised learning for continuous learning and improvement of the model.

The project will be divided into small, fully functional chunks of work. What that means is that everything we start, we will finish in the same section of the article; in the following section we will pick up where we left off and continue improving the application and adding more functionality. If you have to stop reading, I recommend doing so between sections so you don't stop in the middle of the work, making it harder to come back later.

From now on, you should already be familiar with the jargon, so I will stop putting the words in italics. If you are in doubt about the meaning of something, you can always revisit the glossary. While I expect you to be familiar with the jargon, there is no expectation that you will be familiar with the concepts, so I will continue reinforcing them as we move forward. If you didn't understand what we did in the previous example, I recommend going back to it one more time (come on, it's just a quick read!).

Supervised Learning

Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. It infers a function from labeled training data consisting of a set of training examples. All of that to say that supervised learning is exactly what we’ve been doing so far.

We had a dataset with inputs and outputs and we used it to train our model.

The key difference in this new project is that we will use the output of predictions to further train the model, making it smarter over time: the more you predict, the more training data you have, so more samples are added to the dataset and our model can be further trained.

Ok, enough mystery, what are we actually building?

This time around we will build a neural network capable of predicting if a wine is good or bad. Here’s a breakdown of the Python application we will build:

  1. Script to load an initial dataset into MongoDB
  2. Script to generate a model using data stored in MongoDB
  3. Script to perform an evaluation of the model precision
  4. A REST API built in Flask that receives wine features and outputs a prediction of its quality

Here’s the dataset we will use.

Make sure you save and download it, we will need it for later.

MongoDB

If you already have MongoDB set up on your computer, you can skip to the next section!

If you don't have MongoDB set up on your computer, the easiest way to proceed is to create a free mLab account. Head to https://mlab.com and get yourself an account.

Once you log in, hit "create new" to create a brand new database and choose the sandbox plan, which is the free one. It will then let you pick a region to deploy the database; choose the one nearest to where you live. Give it a name and hit continue.

Once it’s created, click on it to go into the database settings area. Then create a user and give it a password.

You should be able to see your MongoDB URI; here's what one looks like:

mongodb://<dbuser>:<dbpassword>@ds253918.mlab.com:53918/wine-ml

Save all of this; we will need it in a bit.

Folder structure

Below is the folder structure we will follow for this project. I recommend you start creating these folders and files now, even if they are completely empty.
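Something like this; the exact paths for the wine model and the dataset are my assumptions, based on the files we'll create:

wine-ml/
├── .env
├── .python-version
├── requirements.txt
├── load.py
├── train.py
├── server.py
├── app/
│   ├── __init__.py
│   ├── models/
│   │   ├── __init__.py
│   │   └── wine.py
│   └── services/
│       ├── __init__.py
│       └── wine_neural_network.py
├── config/
│   ├── __init__.py
│   └── mongo.py
└── storage/
    └── wine.csv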

There's not a lot to it; this is just so we don't have everything in the same folder.

Setting up the environment

In the ‘.env’ file, we will have something like this:
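Something like this (the variable names are assumptions that match the config code we'll write next; the values come from your mLab URI):

MONGO_HOST=ds253918.mlab.com
MONGO_PORT=53918
MONGO_USERNAME=<dbuser>
MONGO_PASSWORD=<dbpassword>
MONGO_DATABASE=wine-ml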

Be sure to update the above with your mLab credentials.

It's also a good practice to include a '.python-version' file containing the version of Python we want to use. This project was tested using 3.6.1, so your '.python-version' would look like this:
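3.6.1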

And finally, the project requirements:
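I'm leaving the versions unpinned here; pin them if you want reproducible installs:

tensorflow
keras
h5py
numpy
pymongo
python-dotenv
flask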

Script 1 — Loading the dataset into MongoDB

Let’s start our code.

If you haven’t yet, create the config folder and include a file named ‘mongo.py’ in it. Don’t forget to add an empty ‘__init__.py’ too.
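Here's 'config/mongo.py'; each connection value gets its own little getter (the environment variable names match the '.env' sketch above):

import os
from dotenv import load_dotenv

# read variables from the '.env' file into the environment
load_dotenv()

def host():
    return str(os.getenv('MONGO_HOST'))

def port():
    return str(os.getenv('MONGO_PORT'))

def username():
    return str(os.getenv('MONGO_USERNAME'))

def password():
    return str(os.getenv('MONGO_PASSWORD'))

def database():
    return str(os.getenv('MONGO_DATABASE'))

def uri():
    return 'mongodb://' + username() + ':' + password() + \
        '@' + host() + ':' + port() + '/' + database() + \
        '?authMechanism=SCRAM-SHA-1'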

Take a minute to read through the code; there is no rocket science going on. We are using a library called 'python-dotenv' to be able to read from the '.env' file, and there are a few functions for returning each value individually, plus one called 'uri' that concatenates them and returns the entire URI.

It’s time to create the wine model. This class will be responsible for the interaction with MongoDB. Note that we are using ‘Pymongo’ to handle the interaction with MongoDB and simply importing the configuration from the mongo config file previously created.
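Here's a minimal sketch of 'app/models/wine.py' (the collection name 'wines' is an assumption):

from pymongo import MongoClient
from config import mongo

class Wine:
    def __init__(self):
        # connect using the uri built by the mongo config file
        client = MongoClient(mongo.uri())
        self.collection = client[mongo.database()].wines

    def insert(self, sample):
        # insert a new wine sample into the database
        return self.collection.insert_one(sample)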

There’s also an ‘insert’ method that does exactly what its name says, it allows for inserting a new wine sample into the database.

Now it's time to create 'load.py'. This script will be responsible for reading the dataset from the storage folder and loading it into MongoDB. Remember, we need to have all the data in Mongo so that we can add more to it later and regenerate the model once there's more training data available.
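A sketch of 'load.py', assuming the dataset lives at 'storage/wine.csv' and uses semicolon-separated columns (as the UCI wine quality CSV does):

import csv
from app.models.wine import Wine

wine = Wine()

with open('storage/wine.csv') as file:
    reader = csv.DictReader(file, delimiter=';')
    for row in reader:
        # cast every column to float before inserting
        sample = {key: float(value) for key, value in row.items()}
        # I'm calling the label column "rate"; rename it here if your
        # CSV calls it something else (e.g. "quality")
        if 'quality' in sample:
            sample['rate'] = sample.pop('quality')
        wine.insert(sample)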

The script above loads the dataset, then loops through all samples and inserts them into MongoDB. Run it; it should take a while to finish, and when successful, you should see the data in the collections view of your mLab panel.

At this point you have the dataset fully loaded in MongoDB and you are ready to get to the next level.

Script 2 — Training the model

As previously stated, we will be adding new samples to the Mongo database, which means there will be a constant flow of new samples that can be used to improve the model. That also means we need a way to read from the database to rebuild the model as needed.

The first obvious thing we need is a way to extract all wine samples from the database. To achieve this, you will need a new function added to the wine model:
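Something like this, added to the Wine class:

def all(self):
    # return every wine sample stored in the collection
    return list(self.collection.find())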

I used the same code from the diabetes example, turned it into a class for better reusability, and placed it under app/services/wine_neural_network.py. Here is what that looks like:
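A sketch of that class; the network mirrors the diabetes one, with input_dim bumped to 11 since the wine dataset has 11 features (an assumption based on the dataset):

from keras.models import Sequential, load_model
from keras.layers import Dense

class WineNeuralNetwork:
    def __init__(self, input, output, path):
        self.input = input    # numpy array of feature rows
        self.output = output  # numpy array of labels ("rates")
        self.path = path      # where the model file lives

    def train(self):
        model = Sequential()
        model.add(Dense(12, input_dim=11, kernel_initializer='uniform', activation='relu'))
        model.add(Dense(8, kernel_initializer='uniform', activation='relu'))
        model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))
        model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
        model.fit(self.input, self.output, epochs=600, batch_size=10)
        model.save(self.path)

    def evaluate(self):
        model = load_model(self.path)
        return model.evaluate(self.input, self.output)

    def predict(self):
        model = load_model(self.path)
        return model.predict(self.input)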

I created very specific methods for each of the critical actions: "train", "evaluate" and "predict". The class expects to receive an input, an output, and the path to the model.

The last thing required to be able to train the model is to create the training script that will tie this all together:
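Here's a sketch of 'train.py' (the feature column names are hypothetical; match them to your dataset's headers):

import numpy
from app.models.wine import Wine
from app.services.wine_neural_network import WineNeuralNetwork

# hypothetical column names; adjust to match your dataset
FEATURES = ['fixed acidity', 'volatile acidity', 'citric acid',
            'residual sugar', 'chlorides', 'free sulfur dioxide',
            'total sulfur dioxide', 'density', 'pH', 'sulphates', 'alcohol']

# load all rated samples from MongoDB and parse them into numpy arrays
samples = Wine().all()
X = numpy.array([[sample[f] for f in FEATURES] for sample in samples])
# rates are assumed to be 0 (bad) or 1 (good)
Y = numpy.array([sample['rate'] for sample in samples])

# train and save the model to the given path
WineNeuralNetwork(X, Y, 'storage/wine.h5').train()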

I appreciate that over the last few sections I simply threw a lot of code at you, but this marks the very end of the theoretical part of this article; it's really time to get our hands dirty.

"Train.py" loads all samples from the database, parses them into a NumPy array, then feeds the input and output into the neural network while specifying the path where the model should be saved.

Imagine your system is constantly receiving new samples that can be used for training. Maybe you want to regularly re-train your model to make it better by leveraging these new samples; all you would have to do is set up a task on the server to run "train.py" every now and then to rebuild the model. Maybe every hour, every day, or every month, depending on your needs.

A quick recap

The last few sections have been a bit hardcore, but we also achieved quite a bit:

  1. A well defined project structure
  2. A solid base for handling MongoDB
  3. The base for the neural network inspired by the previous example
  4. Ability to run the first load of the dataset into Mongo
  5. Ability to re-train the Model in a quick and scalable manner

server.py

This will be the keystone of the application. It will leverage the micro framework Flask, and server.py will wire up everything we have done so far. To start with, we will create a REST endpoint that takes a JSON body like the one below and evaluates whether the wine is of good quality or not based on its features.
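Something like this; the feature names are hypothetical and should match your dataset's columns:

{
  "fixed acidity": 7.4,
  "volatile acidity": 0.7,
  "citric acid": 0.0,
  "residual sugar": 1.9,
  "chlorides": 0.076,
  "free sulfur dioxide": 11.0,
  "total sulfur dioxide": 34.0,
  "density": 0.9978,
  "pH": 3.51,
  "sulphates": 0.56,
  "alcohol": 9.4
}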

Here is a sketch of what 'server.py' looks like (the endpoint path is my choice, and the FEATURES list is the same hypothetical one from 'train.py'):
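import numpy
from flask import Flask, jsonify, request
from app.services.wine_neural_network import WineNeuralNetwork

app = Flask(__name__)

# same hypothetical feature list as in train.py; a shared module would be cleaner
FEATURES = ['fixed acidity', 'volatile acidity', 'citric acid',
            'residual sugar', 'chlorides', 'free sulfur dioxide',
            'total sulfur dioxide', 'density', 'pH', 'sulphates', 'alcohol']

@app.route('/wines/evaluate', methods=['POST'])
def evaluate():
    body = request.get_json()

    # validate that every feature is present in the request
    if body is None or any(f not in body for f in FEATURES):
        return jsonify({'error': 'invalid sample'}), 400

    sample = numpy.array([[body[f] for f in FEATURES]])
    prediction = WineNeuralNetwork(sample, None, 'storage/wine.h5').predict()

    quality = 'good' if prediction[0][0] >= 0.5 else 'bad'
    return jsonify({'quality': quality})

if __name__ == '__main__':
    app.run(port=5000)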

Running the above will start a web server on port 5000 that can receive post requests.

The server will return a response like this when the wine is of good quality:
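With the sketch above, that would be something like:

{
  "quality": "good"
}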

The server will return a response like this when the wine is not of good quality:
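And:

{
  "quality": "bad"
}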

We are also doing some validation to check that the sample information is present in the request. I recommend using Postman for testing this, but a curl request against the endpoint we sketched above is also fine and would look like this:
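curl -X POST http://localhost:5000/wines/evaluate \
  -H "Content-Type: application/json" \
  -d '{"fixed acidity": 7.4, "volatile acidity": 0.7, "citric acid": 0.0, "residual sugar": 1.9, "chlorides": 0.076, "free sulfur dioxide": 11.0, "total sulfur dioxide": 34.0, "density": 0.9978, "pH": 3.51, "sulphates": 0.56, "alcohol": 9.4}'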

What this does not do yet is save the sample into our database. Remember, we want to collect more samples over time, so that we can rate them later and use them to improve our model!

To accommodate that, we will have to make some changes into our code.

Starting with the wine model: it should allow us to insert samples without a rate, as we don't expect a rate when evaluating a wine. At the moment this is not possible; to make it possible, we need to change the "insert" method slightly, to something like this:
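def insert(self, sample):
    # default unrated samples to -1 so rated and unrated can be told apart
    if 'rate' not in sample:
        sample['rate'] = -1
    return self.collection.insert_one(sample)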

That way, samples without a rate will be marked with "-1" in the database, so we can separate unrated from rated samples.

We also don't want to use unrated samples when building the model, so we need to change the "all" method to only return rated samples:
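def all(self):
    # only return samples that have actually been rated
    return list(self.collection.find({'rate': {'$ne': -1}}))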

And let’s not forget to actually save the new sample by adding this to “server.py”:
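Inside the evaluate() view from the sketch above, after validation (remember to import Wine at the top):

# persist the incoming, still unrated, sample
Wine().insert({f: body[f] for f in FEATURES})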

Continuous learning

At this point you should have a fully functional REST API capable of analysing wine samples and making predictions as to whether the wine is good or bad. You are also saving these samples for later rating, which allows for continuous learning in your deep learning model.

You can now go into mLab and manually add rates to these wines.

After doing so, you can re-run the training script and instantly your model becomes stronger and more accurate!

What is next?

You've only spent a few minutes (an hour, maybe?) studying an article full of pictures and lines of code; really not too bad, huh? Just imagine what you can achieve with a few more days of study!

This project could be improved by adding a dashboard where all the samples that haven't been rated could be seen and rated, perhaps through crowdsourcing. The panel could also display the accuracy of the model, and so on.

I hope you enjoyed this, and I hope it set you up with a good starting point and the basic knowledge to have you continue forward on your learning journey, thank you for reading!
