Published in Analytics Vidhya

TensorFlow 2 for Deep Learning - Linear Regression

The code files are available at: https://github.com/ashwinhprasad/Tensorflow-2.0

What is Linear Regression?

Linear regression uses the equation of a line to model the linear relationship between 2 variables. Finding that relationship means finding the two parameters of the line: the slope and the intercept.

y = m*x + c
y : dependent variable
x : independent variable
m : slope
c : intercept
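
As a minimal sketch of what "finding the parameters" means, the slope and intercept of a least-squares line can be computed directly from the data. The data points below are hypothetical and lie exactly on the line y = 2*x + 1, so the formula should recover m = 2 and c = 1. This closed-form approach is shown only for illustration; the rest of this post learns the same two parameters with gradient descent instead.

```python
import numpy as np

# Hypothetical data generated from a known line: y = 2*x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Closed-form least-squares estimates:
#   m = covariance(x, y) / variance(x)
#   c = mean(y) - m * mean(x)
x_mean, y_mean = x.mean(), y.mean()
m = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
c = y_mean - m * x_mean
print(f"m = {m:.2f}, c = {c:.2f}")

# Use the line to predict y for a new value of x
x_new = 10.0
y_new = m * x_new + c
print(f"predicted y at x = {x_new}: {y_new:.2f}")
```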

Example of a linear relationship between 2 variables

In the image above, SAT math score is the independent variable and college GPA is the dependent variable, and there is a clear linear relationship between the two. Hence, once we have found the slope and intercept, we can use the line to predict college GPA for new students from their SAT math scores.

Linear Regression with TensorFlow 2

1. Importing the required Libraries

#importing the libraries
import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

2. Importing the Dataset

Importing the dataset. This is a simple dataset from sklearn, read here from the files train.csv and test.csv. You can use any dataset where the independent variable and the dependent variable have a linear relationship.

#data preprocessing
#loading the train and test sets
data_train = pd.read_csv('train.csv')
data_test = pd.read_csv('test.csv')

#removing rows with null values from the train set
data_train = data_train.dropna(axis=0,how='any')
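
To see what the dropna call does, here is a small sketch on a hypothetical in-memory DataFrame that stands in for train.csv (the values are made up). It counts the missing values first and then drops every row that contains at least one of them.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for train.csv, with one missing y value
df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0],
                   "y": [1.1, np.nan, 2.9, 4.2]})

# Count missing values per column before cleaning
missing_before = df.isna().sum()
print(missing_before)

# axis=0 drops rows; how='any' drops a row if any of its values is missing
clean = df.dropna(axis=0, how="any")
print(clean.shape)
```

Running the same isna().sum() check on the test set is a cheap way to confirm that it needs no cleaning.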

3. Train Test Split

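The data already comes as separate training and test files, so this step only separates the independent variable x from the dependent variable y in each set.

#x_train, y_train, x_test, y_test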
x_train = data_train['x']
y_train = data_train['y']
x_test = data_test['x']
y_test = data_test['y']

4. Plotting the data

plt.figure(figsize=(8,6))
plt.scatter(x_train,y_train)
Data: x vs y

5. Converting the data to NumPy

#converting to numpy
x_train = np.array(x_train).reshape(-1,1)
y_train = np.array(y_train)
print(x_train.shape,y_train.shape)
output:
(699, 1) (699,)
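
The reshape is needed because Keras layers expect input of shape (samples, features), while a single pandas column converts to a 1-D array. A small sketch with hypothetical values shows the change in shape:

```python
import numpy as np

# Hypothetical feature values; a single column becomes a 1-D array
x = np.array([24.0, 50.0, 15.0])

# -1 lets NumPy infer the number of rows; 1 is the number of features
x_2d = x.reshape(-1, 1)
print(x.shape, x_2d.shape)
```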

6. Creating a Sequential Model with TF2

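A Sequential model stacks layers one on top of the other, so that the data flows through them in order. A single Dense layer with one unit is enough here: it holds one weight and one bias, which play the roles of the slope and the intercept.

There are 2 ways to define the model, as shown in the cell below.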

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(1))

"""
(Alternative way of defining a sequential model)
model = Sequential([
    Dense(1)
])"""
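
A Dense layer with one unit and no activation computes a plain linear function of its input. The sketch below reproduces that computation in NumPy; the function name dense_one_unit and the kernel and bias values are hypothetical and chosen only for illustration. It is not the Keras implementation, just the same arithmetic.

```python
import numpy as np

def dense_one_unit(x, kernel, bias):
    """Linear output of a single dense unit: x @ kernel + bias."""
    return x @ kernel + bias

x = np.array([[1.0], [2.0], [3.0]])  # 3 samples, 1 feature
kernel = np.array([[2.0]])           # plays the role of the slope m
bias = np.array([0.5])               # plays the role of the intercept c

y_hat = dense_one_unit(x, kernel, bias)
print(y_hat.ravel())
```

After training, model.get_weights() should return the learned kernel and bias as NumPy arrays, which can be read as the slope and intercept of the fitted line.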

7. Optimizer and Gradient Descent

We use the SGD optimizer here. Since fit uses a batch size of 32 by default, this amounts to mini-batch gradient descent. The loss for this problem is the mean squared error.

from tensorflow.keras.optimizers import SGD
from tensorflow.keras.losses import mse
model.compile(optimizer=SGD(learning_rate=0.0001),loss=mse)
train = model.fit(x_train,y_train,epochs=50)
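
To make the training step less of a black box, here is a from-scratch sketch of mini-batch gradient descent on the mean squared error, written in NumPy. It is an illustration of the idea, not what Keras runs internally. The function name minibatch_sgd, the synthetic data, and the learning rate are all assumptions chosen so that the example converges quickly.

```python
import numpy as np

def minibatch_sgd(x, y, lr=0.01, epochs=200, batch_size=32, seed=0):
    """Fit y ~ m*x + c by mini-batch gradient descent on the MSE loss."""
    rng = np.random.default_rng(seed)
    m, c = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        order = rng.permutation(n)           # reshuffle the data every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            xb, yb = x[idx], y[idx]
            err = (m * xb + c) - yb          # prediction error on this batch
            # Gradients of mean((m*x + c - y)**2) with respect to m and c
            grad_m = 2.0 * np.mean(err * xb)
            grad_c = 2.0 * np.mean(err)
            m -= lr * grad_m
            c -= lr * grad_c
    return m, c

# Hypothetical noiseless data on the line y = 3*x + 2
x = np.linspace(0.0, 1.0, 100)
y = 3.0 * x + 2.0
m, c = minibatch_sgd(x, y, lr=0.1, epochs=500)
print(f"m = {m:.3f}, c = {c:.3f}")
```

The learning rate has to suit the scale of x: larger inputs call for a smaller step, which is why small values such as 0.0001 are common on unscaled data.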

8. Performance Analysis

The loss decreases over the epochs, and the line fitted by the model appears to match the data below well.

#loss over time
plt.plot(train.history['loss'],label='loss')
plt.xlabel('epochs')
plt.ylabel('loss')
plt.legend()
loss over time

9. Testing the Model on the Test Set

#testing the model
y_pred = model.predict(np.array(x_test).reshape(-1,1))
plt.figure(figsize=(10,8))
plt.plot(x_test,y_pred,color='red',linewidth=2,label='Line of best Fit')
plt.scatter(x_test,y_test,label='Test data plots')
plt.legend()
Line of best fit
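
The plot gives a visual check. For a numeric one, the mean squared error and the R² score on the test set can be computed with a few lines of NumPy. The helper mse_and_r2 and the sample values below are hypothetical; in the notebook it would be called with y_test and y_pred.

```python
import numpy as np

def mse_and_r2(y_true, y_pred):
    """Return the mean squared error and R^2 score of the predictions."""
    # ravel() flattens (n, 1) predictions so they line up with (n,) targets
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    residuals = y_true - y_pred
    mse = np.mean(residuals ** 2)
    ss_res = np.sum(residuals ** 2)                 # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation
    r2 = 1.0 - ss_res / ss_tot
    return mse, r2

# Hypothetical targets and predictions
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
mse, r2 = mse_and_r2(y_true, y_pred)
print(f"MSE = {mse:.2f}, R^2 = {r2:.2f}")
```

The flattening matters because model.predict returns an array of shape (n, 1); subtracting it from an (n,) array without flattening would broadcast to an (n, n) matrix.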

Conclusion

The model learns these parameters (the slope and the intercept, which is also called the bias) with an algorithm called gradient descent. To learn how gradient descent works, check out: https://medium.com/analytics-vidhya/linear-regression-with-gradient-descent-derivation-c10685ddf0f4
The parameters of the line of best fit are the ones that minimize the mean of the squared vertical distances between the data points and the line. This can be better understood by going through the gradient descent blog post.
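
The idea of a "best" line can be checked numerically. The sketch below uses hypothetical noisy data, fits a least-squares line with np.polyfit, and compares its sum of squared vertical distances with that of a few slightly shifted lines. Every shifted line should score worse.

```python
import numpy as np

def sum_squared_error(m, c, x, y):
    """Sum of squared vertical distances between the points and the line."""
    return float(np.sum((y - (m * x + c)) ** 2))

# Hypothetical noisy data around the line y = 1.5*x + 4
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 1.5 * x + 4.0 + rng.normal(0.0, 1.0, size=x.shape)

# Least-squares line; polyfit with deg=1 returns [slope, intercept]
m_best, c_best = np.polyfit(x, y, deg=1)
best = sum_squared_error(m_best, c_best, x, y)

# Nudge the slope or the intercept and measure the error again
shifts = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.5), (0.0, -0.5)]
others = [sum_squared_error(m_best + dm, c_best + dc, x, y)
          for dm, dc in shifts]

print(f"best fit error: {best:.2f}")
for (dm, dc), err in zip(shifts, others):
    print(f"slope {dm:+.1f}, intercept {dc:+.1f}: error {err:.2f}")
```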

Once we have the equation of this line, we can predict y for new values of x.

Ashwin Prasad

Artificial Intelligence and Data Science Enthusiast. Updating Neural Network parameters since 2002.