Linear Regression using TensorFlow

Amirsina Torfi
Machine Learning Mindset
Feb 27, 2019


This tutorial deals with training a linear regression model. To explore more content on our blog, please refer to the original post.

In machine learning and statistics, linear regression models the relationship between a dependent variable Y and at least one independent variable X. The linear relationship is modeled by a predictor function whose parameters are estimated from the data; this function is called a linear model. The main advantage of the linear regression algorithm is its simplicity, which makes it very straightforward to interpret the new model and map the data into a new space. In this article, we will introduce how to train a linear model using TensorFlow and how to showcase the generated model. The source code is available in the associated GitHub repository. A project page is also available.
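In symbols, and assuming a single input variable as in the rest of this tutorial, the linear model can be written as below. Here W (the weight) and b (the bias) are the parameters estimated from the data, and the hat notation for the prediction is our own shorthand:

```latex
\hat{Y} = W X + b
```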

Description of the Linear Regression Overall Process

To train the model, TensorFlow loops through the data to find the optimal line (as we have a linear model) that fits the data. The linear relationship between the two variables X and Y is estimated by designing an appropriate optimization problem, which requires a proper loss function. The dataset is available from the Stanford course CS 20SI: TensorFlow for Deep Learning Research.
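To make the role of the loss function concrete, here is a minimal plain-Python sketch. It assumes a mean squared error loss, which is a common choice but not necessarily the one used in the repository, and it uses hypothetical toy data rather than the course dataset. It simply shows that a line closer to the data gets a lower loss, which is what the optimizer tries to exploit.

```python
# Hypothetical toy data lying roughly on y = 2x + 1 (not the article's dataset).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]


def mean_squared_error(w, b, xs, ys):
    """Average squared gap between the line w*x + b and the observed labels."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


loss_flat = mean_squared_error(0.0, 0.0, xs, ys)   # the zero line, a poor fit
loss_close = mean_squared_error(2.0, 1.0, xs, ys)  # a line near the data

print("loss of y = 0:     ", loss_flat)
print("loss of y = 2x + 1:", loss_close)
```

Training amounts to searching for the pair (w, b) that makes this number as small as possible.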

How to Do It in Code?

We start by loading the necessary libraries and the dataset:

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API
import xlrd
import matplotlib.pyplot as plt

# Data file provided by the Stanford course CS 20SI: TensorFlow for Deep Learning Research.
# https://github.com/chiphuyen/tf-stanford-tutorials
DATA_FILE = "data/fire_theft.xls"

# Read the data from the .xls file, skipping the first row.
book = xlrd.open_workbook(DATA_FILE, encoding_override="utf-8")
sheet = book.sheet_by_index(0)
data = np.asarray([sheet.row_values(i) for i in range(1, sheet.nrows)])
num_samples = sheet.nrows - 1

#######################
## Defining flags #####
#######################
tf.app.flags.DEFINE_integer(
    'num_epochs', 50, 'The number of epochs for training the model. Default=50')

# Store all elements in the FLAGS structure.
FLAGS = tf.app.flags.FLAGS

Then we continue by defining and initializing the necessary variables:

# creating the weight and bias.
# The defined variables will be initialized to zero.
W = tf.Variable(0.0, name="weights")
b = tf.Variable(0.0, name="bias")

After that, we define the necessary functions, one for each of the following steps:

  • input
  • inference phase
  • loss function
  • train

The input function is shown here:

def inputs():
    """
    Defining the placeholders.
    :return:
        Returning the data and label placeholders.
    """
    X = tf.placeholder(tf.float32, name="X")
    Y = tf.placeholder(tf.float32, name="Y")
    return X, Y
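The inference, loss, and train functions are not listed in this section. As a rough guide to what such functions compute, here is a minimal plain-Python sketch. It assumes a squared-error loss and a plain gradient-descent update with a hypothetical learning rate of 0.01; the helper name train_step is ours, and the actual TensorFlow implementations in the repository may differ.

```python
def inference(x, w, b):
    """Forward pass of the linear model: the predicted label for input x."""
    return w * x + b


def loss(x, y, w, b):
    """Squared error for one sample (an assumed loss, see the note above)."""
    return (y - inference(x, w, b)) ** 2


def train_step(x, y, w, b, learning_rate=0.01):
    """One gradient-descent update of w and b on a single sample."""
    error = inference(x, w, b) - y
    grad_w = 2.0 * error * x  # derivative of the squared error w.r.t. w
    grad_b = 2.0 * error      # derivative of the squared error w.r.t. b
    return w - learning_rate * grad_w, b - learning_rate * grad_b


# Start from zero, as the tutorial does for W and b, and take one step.
w, b = 0.0, 0.0
x, y = 2.0, 5.0  # a single hypothetical sample
before = loss(x, y, w, b)
w, b = train_step(x, y, w, b)
after = loss(x, y, w, b)
print("loss before: %f, loss after: %f" % (before, after))
```

In the TensorFlow version the gradients are not written by hand: the optimizer derives them from the loss tensor.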

Next, we are going to loop through different epochs of data and perform the optimization process:

with tf.Session() as sess:
    # Initialize the variables (W and b).
    sess.run(tf.global_variables_initializer())

    # Get the input tensors.
    X, Y = inputs()

    # Build the training loss and create the train_op once, outside the loop.
    train_loss = loss(X, Y)
    train_op = train(train_loss)

    # Train the model.
    for epoch_num in range(FLAGS.num_epochs):  # 50 epochs by default
        for x, y in data:
            # The session runs train_op to minimize the loss.
            loss_value, _ = sess.run([train_loss, train_op], feed_dict={X: x, Y: y})

        # Display the most recent loss value after each epoch.
        print('epoch %d, loss=%f' % (epoch_num + 1, loss_value))

    # Save the values of the weight and bias.
    wcoeff, bias = sess.run([W, b])
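The same epoch-and-sample loop structure can be tried without TensorFlow. The following is only a sketch under stated assumptions: the data pairs and the learning rate are hypothetical, the loss is assumed to be the squared error, and the reported value is the average loss over an epoch rather than the last value returned by the session.

```python
# Hypothetical (x, y) pairs; the article's data file is not used here.
data = [(1.0, 3.1), (2.0, 4.9), (3.0, 7.2), (4.0, 8.8)]
num_epochs = 50        # mirrors the default of the num_epochs flag
learning_rate = 0.01   # assumed value, not taken from the article

w, b = 0.0, 0.0        # zero initialization, as for W and b
epoch_losses = []
for epoch_num in range(num_epochs):
    total = 0.0
    for x, y in data:
        error = (w * x + b) - y
        total += error ** 2  # loss measured before this sample's update
        # Stochastic gradient descent: update after every single sample.
        w -= learning_rate * 2.0 * error * x
        b -= learning_rate * 2.0 * error
    epoch_losses.append(total / len(data))

print("first epoch loss: %f, last epoch loss: %f"
      % (epoch_losses[0], epoch_losses[-1]))
print("w=%f, b=%f" % (w, b))
```

Note that the update rule is defined once and reused for every sample, just as the train_op should be created once before the loop.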

In the training code above, sess.run(tf.global_variables_initializer()) initializes all the defined variables. The train_op is built upon train_loss, and each time it is run it updates W and b. In the end, the parameters of the linear model, i.e., wcoeff and bias, are returned. For evaluation, the prediction line and the original data are plotted to show how the model fits the data:

###############################
#### Evaluate and plot ########
###############################
Input_values = data[:,0]
Labels = data[:,1]
Prediction_values = data[:,0] * wcoeff + bias
plt.plot(Input_values, Labels, 'ro', label='main')
plt.plot(Input_values, Prediction_values, label='Predicted')
# Saving the result.
plt.legend()
plt.savefig('plot.png')
plt.close()

The result is depicted in the following figure:

The animated GIF above shows the model with some tiny movements, which demonstrate the updating process. As can be observed, the linear model is certainly not among the best models! However, as we mentioned, its simplicity is its advantage!

Summary

In this tutorial, we walked through creating a linear model using TensorFlow. The line found after training is not guaranteed to be the best one. Different parameters affect the convergence and accuracy. The linear model is found using stochastic optimization, and its simplicity makes our world easier.
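One way to judge how close a trained line is to the best one is to compare it with the ordinary least-squares line, which for a single input variable has a closed-form solution. The sketch below is not part of the original code: the function name least_squares_line and the toy data are hypothetical, and it assumes "best" means minimizing the sum of squared errors.

```python
def least_squares_line(xs, ys):
    """Return (w, b) minimizing the sum of squared errors of w*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


# Hypothetical data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
w, b = least_squares_line(xs, ys)
print("w=%f, b=%f" % (w, b))
```

Comparing these values with the wcoeff and bias obtained from training gives a sense of how far the stochastic optimization is from the least-squares fit.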
