
Create a Linear Regression model with Tensorflow and use it in an Android application

Ferran Garriga
Jul 22, 2018

In this project we will build a simple linear regression graph in Tensorflow that we will later use to predict values from an Android application.

Steps

We will create a graph in Tensorflow that will use linear regression for predicting values; for our loss we will use the squared error, and for our optimisation process a standard gradient descent.

After our model is built and trained, we will save it so that somebody else can use it. In our case, that will be our mobile application, but the model could be used by anybody. You will find pre-trained models on the Internet for many different datasets, trained using different loss and optimisation functions.

Once we have our model saved, for actual usage we will go through a process called graph freezing. This process removes nodes in the graph that are not needed for inference (feeding your model and getting a result); the graph can also be optimised so that it consumes less memory and runs faster.

Linear regression model graph

A linear regression model uses the function y = Wx + b, the equation of a line. This means that our model will try to fit a straight line through all the input data and the reference data (as this is supervised training).

The values that define the line are W and b; with those two parameters, given an input x, you get a result y. We know x and y, so our intention is that the program learns the values of W and b that minimise the error between this line and all the provided points.
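
In Tensorflow (1.x) this function could look like the following sketch; the variables W and b and the placeholder x are defined in the snippets that follow:

    # y = Wx + b, named so that we can later retrieve this node by name
    # (W, b and x are defined in the next snippets)
    y_output = tf.add(tf.multiply(W, x), b, name='y_output')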

This y_output will be the final result after we feed our function with values. Notice that we are naming this node; this is very important, because from the Android application we will request this node by name in order to retrieve the result. The same applies to any model that you import: you will need to know the names of the nodes that you want to use.

W and b are variables that our training method will modify in order to minimise our loss, so let's define them:
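
A minimal sketch; the initial values are arbitrary guesses, and an import is needed if this is the first snippet you run:

    import tensorflow as tf

    # Trainable variables; training will adjust these towards the best fit
    W = tf.Variable([0.3], dtype=tf.float32, name='W')
    b = tf.Variable([-0.3], dtype=tf.float32, name='b')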

Now we have to define the last two crucial pieces of our model, our inputs and outputs. Let me explain a bit more about this step. Our model is going to be used in two different ways:

  1. During training we are going to perform supervised training, meaning that we need working examples of inputs and outputs (correct pairs of x, y).
  2. During inference we are going to feed the model with input values and read the output values predicted by the model (we will only provide x values).

For inference, we have already defined our y_output, but for training, we have to define our working example fields. Looking at our y_output function, it needs an input x; let's define it:
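
Something like this, where the shape [None] lets us feed any number of values at once:

    # Input placeholder, used for both training and inference
    x = tf.placeholder(tf.float32, shape=[None], name='x')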

This x will be used during inference as the source of values for our model, and during training for our input examples. Now we just have to define an output for our working examples; we'll call it y_input because we will feed it into our model (hence the input):
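
A sketch along the same lines:

    # Placeholder for the expected outputs of our working examples
    y_input = tf.placeholder(tf.float32, shape=[None], name='y_input')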

Now we will define a set of very simple working examples that will perfectly fit a line:
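
For instance, four points that lie exactly on the line y = -x (these concrete values are an assumption, chosen to match the W and b discussed below):

    # Four (x, y) pairs that perfectly fit the line y = -1 * x + 0
    x_train = [1.0, 2.0, 3.0, 4.0]
    y_train = [-1.0, -2.0, -3.0, -4.0]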

If we were to manually find the linear function that fits this dataset, we could just assign the value -1 to W and zero to b.

We will see at the end whether our model can find out those values…

Knowing all our defined placeholders, we can see that our x_train will be fed into the placeholder x and our y_train will be fed into the placeholder y_input. And this is why these two are placeholders and not variables.

Training the model

In order to train a model, we need to define two functions.

The loss function

The loss function tells the algorithm how far the output that the model, in its current state, gives us is from the expected output. The objective of the algorithm is to minimise that loss.

For this example, we will use the square of the difference between the model's output and the correct output:
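
A sketch, summing the squared differences over all working examples using the nodes defined earlier:

    # Squared difference between predictions and the expected outputs
    loss = tf.reduce_sum(tf.square(y_output - y_input))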

There are many different loss functions that you will find on the Internet; some work better for particular scenarios than others, so it's always good to use something that has been proven to work instead of reinventing the wheel.

The optimizer

The optimizer function will modify the variables of the function, W and b, for the following training step. There are lots of different optimizers; for this example we will use a very common one called gradient descent. This optimizer calculates the derivative of the loss with respect to the variables of the graph so that those variables can be slightly changed towards minimising the loss. The amount of change that the variables will receive is defined by a parameter called the learning rate (also known as the step size).

I do not want to go into much detail about the importance of the learning rate, but it's a very interesting parameter. The gradient descent landscape is not a smooth uniform function: if you imagine an inverted mountain (where the lowest point is the lowest loss), the walls of that mountain can go up a little bit before going down again.

Local minima from http://www.yaldex.com

Now imagine that we walk down the mountain in steps of a fixed size, say 1. It is possible that with this step size you get stuck against one of those rising walls because your step cannot carry you past it. It could also happen that, if your step is too big, you never actually find the lowest point of the mountain because you keep jumping from one wall to another.

Small steps Vs big steps from http://www.yaldex.com

The size of the learning rate also has a performance impact. Fine-tuning it is very important and there are many different techniques for doing so. A very good one is to use a dynamic learning rate, where you start with one value and adjust it as your loss evolves. But this is just one of the techniques that can be used.

By the way, this value of the model is called a hyperparameter, and there is a whole domain centred on fine-tuning hyperparameters; one of my favourite techniques is to use genetic algorithms.

The optimizer that we are going to use for this example is the following:
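
A sketch; the learning rate of 0.01 is an assumed value:

    # Standard gradient descent; the 0.01 learning rate is an arbitrary choice
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)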

We will create a training step that will help us write less code afterwards. It tells the algorithm that this optimizer should try to minimise our loss function:
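
In our sketch this is a single line:

    # One training step: ask the optimizer to minimise our loss function
    train_step = optimizer.minimize(loss)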

Let’s train!

We have everything in place to train our model. As we are using a hand-written model in Tensorflow, we will have to create a session and run everything through it explicitly. It would be possible to avoid all this by using Tensorflow's built-in estimators, but we won't do that for this example, since what we want is to see how everything works at a low level of implementation.

We have to create a session and then initialise the variables that we defined previously:
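
A sketch of that setup:

    # Create a session and explicitly initialise all variables
    sess = tf.Session()
    sess.run(tf.global_variables_initializer())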

Now we want to execute our functions in Tensorflow; we can do that using session.run(...). For example, we can print the loss of our current model before training:
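
Something like:

    # Loss of the untrained model on our working examples
    print(sess.run(loss, feed_dict={x: x_train, y_input: y_train}))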

Let's follow with the actual training of our model, executing our training step for a few iterations as a very simple approach. For actual projects we wouldn't do it in this simple fashion: we would first split our dataset (expected correct inputs and outputs) into different buckets such as training, validation and test; we would shuffle the data and then go through it in epochs instead of all at once. We could also tell the system to stop when the loss reaches a certain value, so that we do not iterate through the data more times than needed; this is especially important in production, where you might spend many hours or days training your models. But for now a simple approach will work:
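
A sketch; the number of iterations is an arbitrary choice:

    # Run the training step repeatedly, feeding our working examples
    for _ in range(1000):
        sess.run(train_step, feed_dict={x: x_train, y_input: y_train})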

Once we have our model trained, we can print our loss again and also find out the values of the variables that the algorithm has modified:
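
Something like:

    # Learned values of W and b, plus the final loss
    print(sess.run([W, b, loss], feed_dict={x: x_train, y_input: y_train}))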

If you remember our manually calculated values for W and b, the model found pretty close values!

Finally, we are going to do inference on our trained model, feeding some arbitrary values into x and reading the predictions from y_output:
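
A sketch, with assumed input values:

    # Predict y for inputs the model has never seen
    print(sess.run(y_output, feed_dict={x: [5.0, 10.0, 15.0]}))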

Not bad for a very simple model with very unpolished values. We started with a loss of 41 and we ended up with a loss of 0.00003. And for the inference, we have got a precision of 0.01.

Save the graph

Now that we have our model trained, we might want somebody else to be able to use it. You can find lots of pre-trained models on the Internet, and you have just created your first one. Congratulations!

In order to save the model, we are going to use Tensorflow's saver. Before training the model (in our code, that's before our for _ in range line) we are going to write the graph that we have created into a file. This stores our graph as a binary protocol buffer (Google's solution for serialising structured data):
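
A sketch, writing the file into the current directory:

    # Serialise the graph definition as a binary protocol buffer
    # (placed before the training loop, as described above)
    tf.train.write_graph(sess.graph_def, '.', 'linear_regression.pb', as_text=False)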

Then, after our model is fully trained, we will save it. In production code we would possibly want to save the model at several points during training, not just at the end; this is useful in case you want to continue training from a certain point, or after a certain number of hours. Saving will create a checkpoint file with the final values of all our variables:
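
A sketch of that final save:

    # Save all variables with their trained values into a checkpoint
    saver = tf.train.Saver()
    saver.save(sess, 'linear_regression.ckpt')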

Freeze and optimise the graph

Once we have our graph definition and our checkpoint with all the final variables, we can freeze the graph so that it can be optimised and prepared for production use.

  • Our graph definition is serialised in linear_regression.pb
  • Our checkpoint with final variables is serialised in linear_regression.ckpt

For freezing the graph we will feed a function with the definition of our graph, our checkpoint with all our trained variables, and the output file we want to produce, in this case frozen_linear_regression.pb:
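
A sketch using the freeze_graph tool that ships with Tensorflow; the restore_op_name and filename_tensor_name values assume the defaults produced by a standard Saver:

    from tensorflow.python.tools import freeze_graph

    # Combine the graph definition and the checkpoint into one frozen file
    freeze_graph.freeze_graph(input_graph='linear_regression.pb',
                              input_saver='',
                              input_binary=True,
                              input_checkpoint='linear_regression.ckpt',
                              output_node_names='y_output',
                              restore_op_name='save/restore_all',
                              filename_tensor_name='save/Const:0',
                              output_graph='frozen_linear_regression.pb',
                              clear_devices=True,
                              initializer_nodes='')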

Once we have our graph frozen we will optimise the graph for inference. This is a three-step process.

First, we are going to load the frozen graph into memory.

Second, we are going to feed a function with our in-memory frozen graph, the input node name and the output node name that we defined previously in our model. This is one of the reasons we named those nodes.

Third, we will serialise the optimised graph:
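
A sketch of the whole three-step process; the name of the optimised output file is an assumption:

    import tensorflow as tf
    from tensorflow.python.tools import optimize_for_inference_lib

    # 1. Load the frozen graph into memory
    graph_def = tf.GraphDef()
    with tf.gfile.GFile('frozen_linear_regression.pb', 'rb') as f:
        graph_def.ParseFromString(f.read())

    # 2. Optimise it, using the input and output node names from our model
    optimized_graph_def = optimize_for_inference_lib.optimize_for_inference(
        graph_def, ['x'], ['y_output'], tf.float32.as_datatype_enum)

    # 3. Serialise the optimised graph back to disk
    with tf.gfile.GFile('optimized_linear_regression.pb', 'wb') as f:
        f.write(optimized_graph_def.SerializeToString())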

Freezing and optimising the graph can take many different forms; it's worth going through the documentation so that you are aware of what's possible and how to use it for your production environments.

Create an Android application with the necessary libraries

We are going to create an Android application that can use our trained model to make predictions; this step is called doing inference on the model.

We are going to need:

  1. The Tensorflow libraries for Android
  2. To import our model using those libraries
  3. To use the model for inference

Tensorflow libraries for Android

It used to be very complex to use Tensorflow on Android because the library is written in C++: you had to build the libraries yourself and create the JNI layer between Java and C++. You can still do so if you want, and it might actually be beneficial for your project; building the libraries yourself can reduce their size, as you compile only what your specific model really needs, and you can target only the mobile architectures that you desire.

But for this example we will use a much simpler option that Google released in 2017. We can now import the library with just one line in the dependencies { ... } block of our /app/build.gradle file:
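
Something like the following; the '+' picks the latest published version, and you may prefer to pin a concrete one:

    dependencies {
        // Prebuilt Tensorflow library for Android published by Google
        implementation 'org.tensorflow:tensorflow-android:+'
    }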

Once we get the library, we can directly use the Java class TensorFlowInferenceInterface, provided by import org.tensorflow.contrib.android.TensorFlowInferenceInterface;

We are then going to copy our optimised graph into the /app/src/main/assets folder of our Android project.

Import the frozen graph and use it to make predictions

Now we need to define some values from our model:

  • The name of our input node from the graph
  • The name of our output node from the graph
  • The shape of our input
  • The path to our model
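
A sketch of those definitions in Java; the node names are the ones we gave our model, while the shape and the file name (pointing at the optimised graph we copied into assets) are assumptions:

    // Node names must match the names used in the Tensorflow model
    private static final String INPUT_NODE = "x";
    private static final String OUTPUT_NODE = "y_output";
    // We feed one value at a time
    private static final long[] INPUT_SHAPE = {1};
    // The optimised graph copied into /app/src/main/assets
    private static final String MODEL_FILE =
            "file:///android_asset/optimized_linear_regression.pb";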

Once we have everything defined we are going to create an instance of the Tensorflow inference interface:
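
A sketch, loading the model from the assets folder:

    // Loads the frozen, optimised graph and prepares it for inference
    TensorFlowInferenceInterface inferenceInterface =
            new TensorFlowInferenceInterface(getAssets(), MODEL_FILE);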

We now have everything in place to use our model and start predicting values. Once we have an input value (which can come from an EditText in our Android application) we will feed it to our inference interface, run the inference, and then read the result:

  • Feed our inference model
  • Run the inference
  • Extract the results
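
In code, those three steps could look like this; inputValue is assumed to come from the UI:

    float[] inputValues = {inputValue};  // e.g. parsed from an EditText
    float[] results = new float[1];      // one output value per input

    // Feed our inference model
    inferenceInterface.feed(INPUT_NODE, inputValues, INPUT_SHAPE);
    // Run the inference
    inferenceInterface.run(new String[] {OUTPUT_NODE});
    // Extract the results
    inferenceInterface.fetch(OUTPUT_NODE, results);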

Our value will be at results[0], as we know there is only one output for our input (as also reflected in our input shape).

And that is it! You have created a model from scratch and ended up using it in an Android application. Remember that all the code can be found in my GitHub repository.

In the next chapters we will take on more complex models and problems.

I encourage you to download the code and try different inputs and outputs, check the loss function and the resulting values.

Thanks for reading!

Bonus

This was a very simple problem; can we do more? Yes, we can. I would like to mention two extensions of this solution that might inspire you.

What if my domain is richer

We have seen in this example that we input one value and we get one value as output. But in a real-world scenario we might have several input values for a single output, or even several outputs.

For example, you might want to predict how many kilometres your car will cover; that will be your single output y. But that value might depend not only on one input but on a series of values like road temperature, wind, average speed and so on.

So how do we use multiple inputs? Very easy. Notice that the input x in this example is an array: every position of the array is a value that corresponds to an output value. We need to replace this array with a matrix (an array of arrays), and then use matrix multiplication to compute Wx + b.

You are in luck: as long as the sizes of the matrices are correct, Tensorflow already knows how to perform matrix multiplication.
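
A hypothetical sketch of what the model definition could look like with several input values per example; n_features and the shapes are assumptions:

    import tensorflow as tf

    n_features = 3  # e.g. road temperature, wind, average speed

    # Each row of x is now one example with n_features values
    x = tf.placeholder(tf.float32, shape=[None, n_features], name='x')
    W = tf.Variable(tf.zeros([n_features, 1]), name='W')
    b = tf.Variable(tf.zeros([1]), name='b')

    # Matrix multiplication replaces the scalar multiplication
    y_output = tf.add(tf.matmul(x, W), b, name='y_output')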

If you do this, your function is no longer going to be a line, but a plane.

What if a line doesn’t really work for me

This is a very interesting question. We have explored how to create a model that tries to fit a line, but maybe a line is not rich enough to cope with all the variability of your dataset.

What can we do then? We can go up the hierarchy and use a polynomial function, but be aware that if you increase the order of the function you might overfit or create artefacts.

Wikipedia has a good article on polynomial regression and what it looks like.

Functions and approximations are a hot topic in Machine Learning. Searching for polynomial regression and deep learning will give you a bunch of links to explore.
