ML Programming Made [too much] Easy — Part 2

Published in HackerNoon.com, Jun 8, 2018

Machine Learning (ML), and the Artificial Neural Network (ANN) in particular, is a powerful tool that we can use today; you can read all about it in part 1. Given a training dataset, we can train a model that gives us predictions from certain features, for example predicting a house price from the square footage, the number of bedrooms, etc. If you are new to ML or ANNs, please read part 1, which explains what an ANN is and what its building blocks are. Next we are going to see how to build a model, train it and predict with it using TensorFlow.

TensorFlow is an open source ML framework made for everyone to use. It tries to keep simple things simple and complex things achievable. With TensorFlow, an ANN is simple to create, train and use for prediction through a simple Python API.

To create an ANN model, first we need to create an input layer with as many input nodes as the input parameters we want to feed to the model. Next we need to create the hidden layers, which give the model the depth it needs to hold the information it learns, and finally we need to create the output layer, which includes as many nodes as the output requires. All the layers' weights and biases will be initialized and ready to be optimized. We also need to state other known parameters such as the loss function, the training optimization strategy and maybe a logger.
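To make the layer structure concrete, here is a small sketch in plain Python that counts the weights and biases each layer would hold. It assumes fully connected layers and uses the sizes that appear later in this article (9 inputs, hidden layers of 50, 100 and 50 nodes, 1 output); the helper name count_parameters is made up for illustration.

```python
def count_parameters(layer_sizes):
    """Return a (weights, biases) pair for every pair of consecutive layers."""
    counts = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = n_in * n_out  # one weight per connection between the two layers
        biases = n_out          # one bias per node in the receiving layer
        counts.append((weights, biases))
    return counts


# Sizes taken from the article: input, three hidden layers, output.
layer_sizes = [9, 50, 100, 50, 1]
per_layer = count_parameters(layer_sizes)
total = sum(weights + biases for weights, biases in per_layer)

for (weights, biases), n_out in zip(per_layer, layer_sizes[1:]):
    print(f"layer with {n_out} nodes: {weights} weights, {biases} biases")
print("total trainable values:", total)
```

Every one of these values is something the optimizer is free to adjust during training.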

Here is an example of a TensorFlow ANN model written in Python:
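The embedded listing is not reproduced in this text, so what follows is a minimal sketch of such a model and not necessarily the author's exact code. It assumes the TensorFlow 1.x graph API (reached through tensorflow.compat.v1 on TensorFlow 2), ReLU activations, small random initial weights and a learning rate of 0.001. The names zero_weights, zero_biases, layer_1 to layer_3 and cost come from the walkthrough that follows; every other name is made up for illustration.

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # build a static graph, as in TensorFlow 1.x
tf.reset_default_graph()


def zero_weights(shape, name):
    # Named after the article's helper. Assumption: small random start values,
    # because all-zero weights would keep the nodes of a layer identical.
    return tf.Variable(tf.truncated_normal(shape, stddev=0.1), name=name)


def zero_biases(size, name):
    # One bias per node, starting at zero.
    return tf.Variable(tf.zeros([size]), name=name)


# Input layer: 9 features per row; None leaves the number of rows open.
X = tf.placeholder(tf.float32, shape=(None, 9), name="X")

# Hidden layers: 50 -> 100 -> 50 nodes.
layer_1 = tf.nn.relu(tf.matmul(X, zero_weights([9, 50], "w1")) + zero_biases(50, "b1"))
layer_2 = tf.nn.relu(tf.matmul(layer_1, zero_weights([50, 100], "w2")) + zero_biases(100, "b2"))
layer_3 = tf.nn.relu(tf.matmul(layer_2, zero_weights([100, 50], "w3")) + zero_biases(50, "b3"))

# Output layer: a single predicted value per row.
prediction = tf.matmul(layer_3, zero_weights([50, 1], "w_out")) + zero_biases(1, "b_out")

# Expected output, fed in during training.
Y = tf.placeholder(tf.float32, shape=(None, 1), name="Y")

# Loss ("cost"): root mean square error between prediction and expected value.
cost = tf.sqrt(tf.reduce_mean(tf.squared_difference(prediction, Y)))

# Optimization strategy: Adam (the learning rate is an assumed value).
optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)
```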

This looks quite harsh, so let's dive deep and break down the complexity. First we have two initialisation functions, one for the weights and one for the biases. The zero_weights function creates an initial weight variable from one layer to another, and zero_biases creates a bias variable for all the nodes in a layer.

Next we are going to create all the layers in our ANN, starting from the input layer. Assuming our input has 9 variables, we create a new tf input with shape None by 9, where None leaves the number of rows open, because it is the first layer. Then we create 3 hidden layers, 50 nodes -> 100 nodes -> 50 nodes, named 'layer_1', 'layer_2' and 'layer_3'. Next we create the output layer, which has only 1 output. Finally we create the loss function, named cost, which is another name for the loss function. The loss is measured as the root mean square error, which tells us how close the prediction is to the desired value, and the optimization strategy is Adam, a method for stochastic optimization. It is one of the most widely used optimizers today.
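As a worked example of the loss described above, here is a small sketch of the root mean square error in plain Python. The function name rmse and the sample numbers are made up for illustration.

```python
import math


def rmse(predicted, expected):
    """Root mean square error between two equally long lists of numbers."""
    squared_errors = [(p - e) ** 2 for p, e in zip(predicted, expected)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# A perfect prediction has zero loss.
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))

# Every prediction is off by 3, so the loss is 3.
print(rmse([0.0, 0.0], [3.0, 3.0]))
```

The closer the predictions get to the expected values, the closer this number gets to zero, which is what the optimizer is trying to achieve.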

To train a newly created model we need to feed it the training dataset. As the training process evolves, the weights and biases get more accurate, fitting the input combinations better and reducing the loss function.

Here is an example of model training given X (the inputs) and Y (the expected outputs):
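The embedded listing is not reproduced in this text, so the following is a minimal, self-contained sketch of such a training loop and not necessarily the author's exact code. It assumes the TensorFlow 1.x graph API via tensorflow.compat.v1, rebuilds a compact version of the model with a made-up helper called dense, and uses synthetic NumPy data in place of the article's data frame. The learning rate and the random data are assumptions.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()
tf.reset_default_graph()
tf.set_random_seed(0)

# Synthetic stand-in for the data frame: 9 input columns and one "value" column.
# With a pandas DataFrame df this would be:
#   X_train = df.drop("value", axis=1).values
#   Y_train = df[["value"]].values
rng = np.random.RandomState(0)
X_train = rng.rand(200, 9).astype(np.float32)
Y_train = X_train.sum(axis=1, keepdims=True).astype(np.float32)


def dense(inputs, n_in, n_out):
    # Hypothetical helper: one fully connected layer without activation.
    weights = tf.Variable(tf.truncated_normal([n_in, n_out], stddev=0.1))
    biases = tf.Variable(tf.zeros([n_out]))
    return tf.matmul(inputs, weights) + biases


X = tf.placeholder(tf.float32, shape=(None, 9))
Y = tf.placeholder(tf.float32, shape=(None, 1))
hidden = tf.nn.relu(dense(X, 9, 50))
hidden = tf.nn.relu(dense(hidden, 50, 100))
hidden = tf.nn.relu(dense(hidden, 100, 50))
prediction = dense(hidden, 50, 1)
cost = tf.sqrt(tf.reduce_mean(tf.squared_difference(prediction, Y)))
optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)

losses = []
with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    for epoch in range(50):
        # One epoch: feed the whole dataset and take one optimization step.
        session.run(optimizer, feed_dict={X: X_train, Y: Y_train})
        # Measure the loss after the step so we can watch it decrease.
        loss = float(session.run(cost, feed_dict={X: X_train, Y: Y_train}))
        losses.append(loss)
        if epoch % 10 == 0:
            print(f"epoch {epoch}: loss {loss:.4f}")
```

The printed loss should shrink from epoch to epoch; the exact numbers depend on the random start values.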

Now, the model is just an object holding all the variables we previously created. df is a data frame and value is the expected output column. X will be everything besides the value column and Y will be only the value column. Now we run the optimizer for 50 epochs; each epoch feeds the whole dataset and executes the optimizer to improve the results. After each epoch we calculate the loss function and watch it decrease over time.

We have to validate the created model with a new testing dataset. The testing dataset has to be data that the model has not seen before, and it shows us whether the error stays low on new data, since the model can produce wrong results for various reasons. One of the reasons can be overfitting, which happens when the model is trained too well on the specific training dataset. It can be caused by too many epochs, for example.
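One common way to get such a testing dataset is to hold back part of the available rows before training. Here is a minimal sketch in plain Python; the function name split_rows, the 20% test share and the fixed seed are assumptions made for illustration.

```python
import random


def split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and split them into (training_rows, testing_rows)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    # The first n_test shuffled rows are held back and never used for training.
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))  # stand-in for the rows of a real dataset
training_rows, testing_rows = split_rows(rows)
print(len(training_rows), "training rows,", len(testing_rows), "testing rows")
```

If the loss on the training rows keeps falling while the loss on the testing rows starts to rise, that is a typical sign of overfitting.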

Creating a TensorFlow model requires ~35 lines of code, which contain a lot of boilerplate, default values and best practices that have to be spelled out. Keras is a tool that tries to solve this problem. Keras is a high-level API built on top of other frameworks such as TensorFlow, and provides the ability to build ANN models with almost no effort.

For comparison, the same TensorFlow model can be built with Keras in ~6 lines of code:
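The embedded listing is not reproduced in this text, so here is a minimal sketch of what such a Keras model could look like, not necessarily the author's exact code. It assumes the Keras API bundled with TensorFlow (tensorflow.keras), ReLU activations, and mean squared error as the loss, which has its minimum at the same point as the root mean square error.

```python
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(9,)),                    # 9 input variables
    keras.layers.Dense(50, activation="relu"),  # hidden layer 1
    keras.layers.Dense(100, activation="relu"), # hidden layer 2
    keras.layers.Dense(50, activation="relu"),  # hidden layer 3
    keras.layers.Dense(1),                      # single output value
])
model.compile(loss="mean_squared_error", optimizer="adam")
```

Training is then a single call such as model.fit(X, Y, epochs=50), where X and Y are the input and expected output arrays.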

Keras provides a simplified version of the implementation and was built for quick experimentation. Although Keras is lightweight, it provides a good level of customization, and we can use it today not only for experimenting but also for production-level features. The implementation is much more concise, and the best practices are applied behind the scenes. It is almost too easy to use Keras and build ANNs quicker than you think.

Written by Alon Yehezkel