Mini ANN to predict sales revenue using Tensorflow + Python with Google Colab

Published in

Analytics Vidhya

7 min read · Aug 30, 2020

Introduction

In today's market, every company wants to forecast the sales of its products based on different parameters. Such forecasts help an organisation plan ahead and prepare for the expected demand. They can also protect the organisation from upcoming losses and often allow it to make more profit.

Business Challenge

A company owns an ice cream business and wants a model that predicts daily revenue (in dollars) from ice cream sales based on temperature (in degrees Celsius). For this purpose, the company decided to build a simple Artificial Neural Network. The provided data has the air temperature as input and the overall daily revenue in dollars as output.

Machine Learning complete project

In this project, a mini Artificial Neural Network is developed that functions as a regression model and predicts revenue from the input parameter. The model is then cross-checked against the LinearRegression model in scikit-learn, a classical and very efficient machine learning method for this kind of problem.

Importing the Dataset

This project uses a dataset named SalesData.csv, which can be downloaded from https://drive.google.com/file/d/1pZeKsoDFMyTvxC-YndiHu_huto4WOreA/view?usp=sharing. The complete dataset can also be viewed at https://github.com/sushantkumar-estech/Prediction-of-sales-revenue-using-mini-ANN/blob/master/SalesData.csv.

The dataset is a 2D array of 500 rows × 2 columns. The first column holds the temperature, which is used as the input during model training, and the second column holds the revenue, which is used as the output.

This is what the top rows of the dataset look like:

and the last five rows of the dataset:

Some more information about the dataset:

And some more.

Details about the type of data in the dataset
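The inspection steps above can be sketched with pandas as follows. This is only an illustration: the three rows are placeholders rather than values from SalesData.csv, and the column names `Temperature` and `Revenue` are assumed, so adjust them to the real file header.

```python
import io

import pandas as pd

# Placeholder rows for illustration only; they are NOT taken from SalesData.csv.
# The column names "Temperature" and "Revenue" are assumptions as well.
sample_csv = io.StringIO(
    "Temperature,Revenue\n"
    "10.0,250.0\n"
    "20.0,480.0\n"
    "30.0,700.0\n"
)

# In Colab this would typically be pd.read_csv("SalesData.csv") after uploading the file.
sales_df = pd.read_csv(sample_csv)

print(sales_df.head())      # top rows (first five by default)
print(sales_df.tail())      # bottom rows (last five by default)
print(sales_df.describe())  # count, mean, std, min, quartiles and max per column
sales_df.info()             # column dtypes and non-null counts
```

With the real file, `sales_df.shape` would be expected to report 500 rows and 2 columns, as described above.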

Visualization of dataset

The plot below shows Temperature (in °C) against Revenue (in dollars). Each blue dot shows the revenue generated on a day with the corresponding temperature.

Temperature [°C] v/s Revenue [in dollars] curve

From the plot, it is easy to see that revenue is approximately linearly related to the day's temperature, i.e. if the temperature increases, revenue also increases. However, the slope (m) and intercept (b) are not yet known. (Note: the line equation is Y = mX + b, where 'm' is the slope of the line and 'b' is the intercept.)

Creating the model

As mentioned, the model is a mini artificial neural network which consists of one input layer, one hidden layer and one output layer in total. There is one input value, X, the temperature (in °C), from which one output value, Y, the revenue (in dollars), is calculated. That is why the model has only one neuron in the input layer and one in the output layer. Since it is a mini ANN, the hidden layer also has only one neuron, which is sufficient for this application. All 3 layers, with one neuron each, are fully connected with each other.

The created Model summary is as follows:

As mentioned, it is a mini Artificial Neural Network with only 2 parameters that need to be trained (in neural-network terms, one weight and one bias). One represents the slope (m) and the other the intercept (b) of the line equation Y = mX + b.
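A model with exactly two trainable parameters can be sketched in Keras as shown below. The layer layout is an assumption chosen to match the two-parameter summary: a single Dense unit on a one-dimensional input, which holds one weight and one bias. The variable name `model` is illustrative.

```python
import tensorflow as tf

# Assumption: one Dense unit on a one-dimensional input, which gives exactly
# two trainable parameters (one weight for the slope, one bias for the intercept).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),      # one input value: temperature
    tf.keras.layers.Dense(units=1),  # one neuron: weight * x + bias
])

model.summary()              # prints the layer table and parameter counts
print(model.count_params())  # total number of parameters in the model
```

Stacking a second one-neuron Dense layer would add another weight and bias, so the parameter count is a quick way to check which layout a summary describes.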

Training the model

To train the model, the dataset needs to be passed to it; in other words, the model needs to be fitted to the dataset. Before training, settings such as the optimizer, the number of epochs and the validation split need to be decided. These values affect how the model trains. A model must always be protected from over-fitting and under-fitting.

For this project, the 'Adam' optimizer, 100 epochs and a validation split of 0.2 were chosen. The model is trained on 400 values from the dataset and simultaneously validated on the remaining 100.

As training progresses, epoch by epoch, the model gets better and the losses decrease. The performance summary of the model on the training and validation data during the first 10 epochs is as follows:
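The training setup described above might look roughly like the sketch below. It runs on placeholder data, not on SalesData.csv, and the learning rate of 0.5 is an assumption, since the article only names the optimizer. The one-unit model is likewise an assumed layout.

```python
import numpy as np
import tensorflow as tf

# Placeholder data for illustration only (not SalesData.csv): 500 noisy points
# scattered around an arbitrary straight line.
rng = np.random.default_rng(0)
temperature = rng.uniform(0.0, 45.0, size=(500, 1)).astype("float32")
noise = rng.normal(0.0, 25.0, size=(500, 1)).astype("float32")
revenue = 20.0 * temperature + 40.0 + noise

# Assumed layout: a single Dense unit on a one-dimensional input.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(units=1),
])

# The learning rate is an assumption; only 'Adam' is named in the text.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.5),
    loss="mean_squared_error",
)

# validation_split=0.2 holds out the last 20% of the samples (100 of 500)
# for validation; the remaining 400 are used for training.
history = model.fit(
    temperature, revenue, epochs=100, validation_split=0.2, verbose=0
)

print(history.history["loss"][-1])      # final training loss
print(history.history["val_loss"][-1])  # final validation loss
```

The `history.history` dictionary keeps one loss value per epoch for both the training and the validation data, which is what a loss-versus-epochs plot is drawn from.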

As can be seen from the summary, the training loss after the 1st epoch is 312605.4725 and the validation loss is 336566.4600. (The loss calculated here is the mean squared error: the squared differences between the actual and predicted outputs are summed and divided by the number of samples, i.e. 400 for the training dataset and 100 for the validation dataset. To know more about 'mean squared error', please check the link: https://www.freecodecamp.org/news/machine-learning-mean-squared-error-regression-line-c7dde9a26b93/)
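The mean squared error can be written out by hand as in the following sketch. The function name and the sample values are made up for illustration and do not come from the dataset.

```python
import numpy as np


def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((actual - predicted) ** 2))


# Made-up values for illustration; they are not from the dataset.
# Differences are -10, 10 and -30, so the squares sum to 1100 over 3 samples.
example_mse = mean_squared_error([100.0, 200.0, 300.0], [110.0, 190.0, 330.0])
print(example_mse)
```

Because the differences are squared, a few large errors dominate the loss, which is why the values are so high during the first epochs.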

The performance summary of the model from epoch 36 to epoch 46 can be seen below:

Here the reduction in losses can be seen: the training loss came down from 1060.9358 to 644.7025 and the validation loss from 1372.5654 to 876.7715.

The performance summary of the model during the last 10 epochs can be seen below:

During the last epochs, the losses become stable and do not decrease any further. These are therefore the minimum losses the model reaches with the given hyper-parameter values.

Evaluating the Model

The model loss v/s epochs graph shows how the training loss decreases as the number of epochs increases.

As can be seen, the loss decreases as the epochs increase, but it saturates after about 35 epochs, and any further reduction is negligible.

The validation loss shows the same behaviour, as can be seen in the following graph: it also decreases as the number of epochs increases and is almost saturated after 35 epochs. The blue line shows the training loss and the red line the validation loss.

100 epochs are more than sufficient for training such a model with the mentioned hyper-parameters. The model can be trained with fewer epochs and still achieve the same accuracy as after 100 epochs.

As mentioned earlier, the model has 2 parameters that need to be trained. After training, their values are 22.191706 and 25.261503. Thus, for the line equation, the slope (m) = 22.191706 and the intercept (b) = 25.261503, which makes the equation Y = 22.191706*X + 25.261503 or, for this project, Revenue (dollars) = 22.191706*Temperature (°C) + 25.261503.

The plot below shows the mini ANN model's predicted output on the same dataset using the above-mentioned model equation. (Of course, the model does this by itself; only the values need to be passed through it.) The model-predicted line (shown in red) fits the dataset well, so the model performs well on the available data.

Performing prediction using model

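To make a prediction with the trained mini ANN model, a temperature value of 5°C was passed through the model, which gave an output of Revenue = 134.55237. (Substituting 5°C into the equation above gives 22.191706 × 5 + 25.261503 ≈ 136.22, so this output presumably comes from a separate training run with slightly different weights.)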

Confirming the model using Regression Kit in Scikit learn

Scikit-learn provides a ready-made model dedicated to regression tasks. Here, LinearRegression is called and trained on the same dataset. After training, the slope (m) is 21.44362551 and the intercept (b) is 44.83126709.

Comparing these parameter values with those obtained after training the mini ANN, the slope difference is 0.74808049 and the intercept difference is 19.5697641.

The plot below shows the scikit-learn Linear Regression model's predicted output on the same dataset.
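Fitting LinearRegression and reading back the slope and intercept can be sketched as follows. The four data points are placeholders that lie exactly on Y = 2X + 1 and are not taken from SalesData.csv. The variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Placeholder data (not SalesData.csv): points that lie exactly on Y = 2X + 1.
temperature = np.array([[0.0], [10.0], [20.0], [30.0]])  # shape (n_samples, 1)
revenue = np.array([1.0, 21.0, 41.0, 61.0])              # shape (n_samples,)

regressor = LinearRegression()
regressor.fit(temperature, revenue)

slope = float(regressor.coef_[0])        # fitted slope (m)
intercept = float(regressor.intercept_)  # fitted intercept (b)
print(slope, intercept)

# Prediction for a single temperature value; the input must stay two-dimensional.
print(regressor.predict(np.array([[5.0]])))
```

LinearRegression solves the least-squares problem directly, so unlike the ANN it needs no epochs, learning rate or validation split.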

For a prediction with the trained scikit-learn Linear Regression model, a temperature value of 5°C was passed through the model, which gave an output of Revenue = 152.04939464. Comparing the scikit-learn Linear Regression model's output with the mini ANN model's output, there is a revenue difference of 17.497025 dollars, which is due to the difference in their slope (m) and intercept (b) values.
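The figures in this comparison can be re-derived with a few lines of arithmetic. The sketch below uses only numbers reported in this article; the function and variable names are illustrative.

```python
def predict_revenue(temperature_c, slope, intercept):
    """Evaluate the line equation Y = m * X + b for one temperature value."""
    return slope * temperature_c + intercept


# Parameters reported for the scikit-learn Linear Regression model.
SKLEARN_SLOPE = 21.44362551
SKLEARN_INTERCEPT = 44.83126709

# Revenue reported for the mini ANN at 5 degrees Celsius.
ANN_REVENUE_AT_5C = 134.55237

sklearn_revenue_at_5c = predict_revenue(5.0, SKLEARN_SLOPE, SKLEARN_INTERCEPT)
revenue_gap = sklearn_revenue_at_5c - ANN_REVENUE_AT_5C

print(sklearn_revenue_at_5c)  # scikit-learn prediction at 5 degrees Celsius
print(revenue_gap)            # difference between the two predictions
```

Up to floating-point rounding, the results agree with the 152.04939464 and 17.497025 quoted above.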

Conclusion

The mini ANN model performs well when compared with the given dataset values, and its revenue output falls in the same range as that of the scikit-learn Linear Regression model. The mini ANN model could be improved by changing hyperparameter values, by adding further hidden layers, or by adding more neurons to the hidden layer(s).

Application

This kind of model can be used in any business application where a prediction/forecast needs to be made from an available past dataset.

Reference

The full code and the dataset can be downloaded from the following GitHub repository.

https://github.com/sushantkumar-estech/Prediction-of-sales-revenue-using-mini-ANN.git

Note

Also, if you are a beginner in machine learning and keen to learn more, you can search for the GitHub account sushantkumar-estech or use the link https://github.com/sushantkumar-estech for interesting projects.

Select any project you wish for practice, and in case of any questions, you can write to me. I would be happy to help.

Enjoy reading and happy learning!!