# TensorFlow 2 for Deep Learning - Linear Regression

*Code files will be available at: **https://github.com/ashwinhprasad/Tensorflow-2.0***

# What is Linear Regression?

Linear regression uses the equation of a line to model the linear relationship between two variables. By finding the linear relationship, I mean finding the parameters of the line (slope and intercept).

**y = m*x + c**

- y : dependent variable
- x : independent variable
- m : slope
- c : intercept

As an example, consider SAT math scores as the independent variable and college GPA as the dependent variable. There is a clear linear relationship between these two variables, so once we find the slope and intercept, we can use a line to predict college GPA for new students given their SAT math scores.
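To make the equation concrete, here is a minimal sketch of predicting with the line y = m*x + c once the slope and intercept are known. The values of `m` and `c` here are made up for illustration, not learned from any dataset.

```python
import numpy as np

# hypothetical slope and intercept (illustrative values only)
m, c = 2.0, 1.0

# predict y for a few new x values using y = m*x + c
x_new = np.array([0.0, 1.0, 2.5])
y_pred = m * x_new + c
print(y_pred)  # predictions: 1.0, 3.0, 6.0
```

The rest of this post is about how the model learns good values for `m` and `c` from data instead of having them supplied by hand.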

# Linear Regression with Tensorflow 2

**1. Importing the required Libraries**

```python
# importing the libraries
import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
```

**2. Importing the Dataset**

This dataset is a simple dataset from sklearn. You can use any dataset where the independent variable and dependent variable have a linear relationship.

```python
# data preprocessing
# train set
data_train = pd.read_csv('train.csv')
data_test = pd.read_csv('test.csv')

# removing null values
data_train = data_train.dropna(axis=0, how='any')
```

**3. Train Test Split**

Splitting the data into training and test sets.

```python
# x_train, x_test, y_train, y_test
x_train = data_train['x']
y_train = data_train['y']
x_test = data_test['x']
y_test = data_test['y']
```

**4. Plotting the data**

```python
plt.figure(figsize=(8,6))
plt.scatter(x_train, y_train)
```

**5. Converting the data to numpy**

```python
# converting to numpy
x_train = np.array(x_train).reshape(-1,1)
y_train = np.array(y_train)
print(x_train.shape, y_train.shape)
```

Output:

```
(699, 1) (699,)
```

**6. Creating a Sequential Model with TF2**

A Sequential model allows stacking one layer on top of another, letting the data flow through them in order.

There are two ways to define one, as shown in the cell below.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(1))

"""
(Alternative way of defining a sequential model)
model = Sequential([
    Dense(1)
])
"""
```

## 7. Optimizer and Gradient Descent

We use the mini-batch stochastic gradient descent (SGD) optimizer and mean squared error loss for this problem.

```python
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.losses import mse

model.compile(optimizer=SGD(learning_rate=0.0001), loss=mse)
train = model.fit(x_train, y_train, epochs=50)
```

## 8. Performance Analysis

The loss decreases over time, and the model's fitted line closely matches the data below.

```python
# loss over time
plt.plot(train.history['loss'], label='loss')
plt.xlabel('epochs')
plt.ylabel('loss')
plt.legend()
```

## 9. Testing the Model on the Test Set

```python
# testing the model
y_pred = model.predict(np.array(x_test).reshape(-1,1))

plt.figure(figsize=(10,8))
plt.plot(x_test, y_pred, color='red', linewidth=2, label='Line of best fit')
plt.scatter(x_test, y_test, label='Test data')
plt.legend()
```

# Conclusion

The model learns these parameters (slope and intercept) through an algorithm called gradient descent. To learn how gradient descent works, check out: https://medium.com/analytics-vidhya/linear-regression-with-gradient-descent-derivation-c10685ddf0f4.

So, the parameters of the line of best fit are chosen so that the sum of squared vertical distances between the data points and the line is minimized. This can be better understood by going through the gradient descent blog post.
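The idea can be sketched in plain NumPy, without Keras: repeatedly nudge the slope and intercept in the direction that reduces the mean squared error. The synthetic data, learning rate, and iteration count below are illustrative choices, not values from the post.

```python
import numpy as np

# synthetic data generated around a known line: slope 3, intercept 2
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 3.0 * x + 2.0 + rng.normal(0, 0.5, size=200)

m, c = 0.0, 0.0   # arbitrary starting parameters
lr = 0.01         # learning rate

for _ in range(2000):
    error = m * x + c - y                # vertical distance to each point
    grad_m = 2 * np.mean(error * x)      # gradient of the MSE w.r.t. m
    grad_c = 2 * np.mean(error)          # gradient of the MSE w.r.t. c
    m -= lr * grad_m                     # step downhill
    c -= lr * grad_c

print(m, c)  # should be close to the true slope 3 and intercept 2
```

Keras's `SGD` optimizer performs essentially this update on the `Dense(1)` layer's weight and bias, one mini-batch at a time.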

Once we have the equation of this line, we can predict y for new values of x.