Multiple linear regression is a statistical technique that predicts the value of one dependent variable from two or more independent variables by fitting a linear equation to the data.
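Written out, the model takes the form below. The symbols are notation introduced here for illustration and do not appear in the original post: \beta_0 is the intercept, \beta_1 through \beta_p are the coefficients of the p independent variables, and \hat{y} is the predicted value. The second line assumes the column names used later in this post (TV, Radio, Newspaper, Sales).

```latex
\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p

\widehat{\text{Sales}} = \beta_0 + \beta_1\,\text{TV} + \beta_2\,\text{Radio} + \beta_3\,\text{Newspaper}
```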
Implementation
Importing libraries
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
Load the data and split X and y
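# Read the dataset; Salesdata.csv is expected in the working directory
data = pd.read_csv('Salesdata.csv')
# Independent variables (features)
X = data[['TV', 'Radio', 'Newspaper']]
# Dependent variable (target)
y = data['Sales']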
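Before modelling, it can help to confirm that the expected columns exist and contain no missing values. The sketch below is only an illustration: it uses a tiny hypothetical DataFrame with made-up numbers in place of Salesdata.csv, and the `required` and `missing_cols` names are assumptions introduced here.

```python
import pandas as pd

# Hypothetical stand-in for Salesdata.csv; the numbers are made up for illustration.
data = pd.DataFrame({
    "TV": [100.0, 50.0, 200.0, 150.0],
    "Radio": [20.0, 10.0, 40.0, 30.0],
    "Newspaper": [30.0, 5.0, 60.0, 15.0],
    "Sales": [12.0, 7.0, 22.0, 17.0],
})

# Columns the rest of the walkthrough relies on.
required = ["TV", "Radio", "Newspaper", "Sales"]
missing_cols = [c for c in required if c not in data.columns]
if missing_cols:
    raise KeyError(f"Missing columns: {missing_cols}")

# Count missing values per column; LinearRegression does not accept NaN inputs.
print(data[required].isna().sum())

# Split into features and target, as in the post.
X = data[["TV", "Radio", "Newspaper"]]
y = data["Sales"]
print(X.shape, y.shape)
```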
Train and Test Split
# Hold out 30% of the rows for testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=1)
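To see what `test_size=0.30` and `random_state=1` do in practice, here is a small sketch on synthetic data. The 200-row random DataFrame is an assumption made for illustration and is not the dataset from the post.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 200 rows of random values.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.uniform(0, 100, size=(200, 3)),
                 columns=["TV", "Radio", "Newspaper"])
y = pd.Series(rng.uniform(0, 30, size=200), name="Sales")

# 30% of the rows go to the test set, the remaining 70% to the training set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=1)
print(X_train.shape, X_test.shape)

# Repeating the call with the same random_state selects the same rows.
X_train2, X_test2, y_train2, y_test2 = train_test_split(
    X, y, test_size=0.30, random_state=1)
same_split = X_test.index.equals(X_test2.index)
print(same_split)
```

Fixing `random_state` keeps the reported metrics comparable between runs. A different seed would generally give slightly different scores.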
Fit and Run the Model
lr = LinearRegression()
lr.fit(X_train, y_train)
lr_preds = lr.predict(X_test)
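After fitting, the learned equation can be read from the model's `coef_` and `intercept_` attributes. The sketch below fits on synthetic data built from known coefficients (an assumption for illustration, not the post's dataset), so the recovered values can be checked against what went in.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic features (assumption): random values with the post's column names.
rng = np.random.default_rng(42)
X = pd.DataFrame(rng.uniform(0, 100, size=(200, 3)),
                 columns=["TV", "Radio", "Newspaper"])

# Noise-free target built from known coefficients, so the fit should recover them.
y = 3.0 + 0.05 * X["TV"] + 0.20 * X["Radio"] + 0.0 * X["Newspaper"]

lr = LinearRegression()
lr.fit(X, y)

# One coefficient per feature, in the same order as the columns of X.
for name, coef in zip(X.columns, lr.coef_):
    print(f"{name}: {coef:.4f}")
print(f"Intercept: {lr.intercept_:.4f}")
```

Each coefficient is the expected change in the target for a one-unit increase in that feature, holding the other features fixed.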
RMSE and R²
print("RMSE:", np.sqrt(mean_squared_error(y_test, lr_preds)))
print("R^2:", r2_score(y_test, lr_preds))
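To make the two metrics concrete, the following sketch computes RMSE and R² by hand and compares them with scikit-learn's functions. The four-value arrays are made-up numbers for illustration only.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Made-up true values and predictions (assumption, for illustration only).
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

residuals = y_true - y_pred

# RMSE: square root of the mean squared residual, in the units of the target.
rmse_manual = np.sqrt(np.mean(residuals ** 2))

# R^2: 1 minus the ratio of residual to total sum of squares.
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot

# The same quantities from scikit-learn.
rmse_sklearn = np.sqrt(mean_squared_error(y_true, y_pred))
r2_sklearn = r2_score(y_true, y_pred)

print(rmse_manual, rmse_sklearn)
print(r2_manual, r2_sklearn)
```

A lower RMSE means smaller typical errors. An R² closer to 1 means the model explains more of the variance in the target.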
Prediction with custom value
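# Prediction for TV = 121, Radio = 8.4, Newspaper = 48.7
# Pass a DataFrame with the training column names so scikit-learn does not warn about missing feature names
new_data = pd.DataFrame([[121, 8.4, 48.7]], columns=['TV', 'Radio', 'Newspaper'])
print(lr.predict(new_data))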
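A prediction from a linear model is the intercept plus the coefficient-weighted sum of the inputs. The sketch below checks this against `predict` for the same input values. The model here is fitted on synthetic data (an assumption for illustration), so the printed number will not match a model trained on the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

features = ["TV", "Radio", "Newspaper"]

# Synthetic training data (assumption): random features and a noisy linear target.
rng = np.random.default_rng(7)
X = pd.DataFrame(rng.uniform(0, 100, size=(50, 3)), columns=features)
y = (2.0 + 0.04 * X["TV"] + 0.15 * X["Radio"] + 0.01 * X["Newspaper"]
     + rng.normal(0, 1, size=50))

lr = LinearRegression().fit(X, y)

# The same custom input as in the post, wrapped in a DataFrame with matching columns.
new_row = pd.DataFrame([[121, 8.4, 48.7]], columns=features)

# Prediction via scikit-learn.
pred_sklearn = lr.predict(new_row)[0]

# The same prediction by hand: intercept + sum(coefficient * value).
pred_manual = lr.intercept_ + np.dot(lr.coef_, new_row.iloc[0].to_numpy())

print(pred_sklearn, pred_manual)
```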
Have doubts? Need help? Contact me!
LinkedIn: https://www.linkedin.com/in/dharmaraj-d-1b707898
Github: https://github.com/DharmarajPi