Linear Regression with Sklearn

Rohit Raj
Published in Thrive in AI
3 min read, Jul 17, 2022

Linear regression analysis is used to predict a variable's value based on another variable's value. The variable you want to predict is called the dependent variable. The variable you are using to predict the other variable’s value is called the independent variable.

We can use the scikit-learn (sklearn) library for Python to fit a linear regression model in just a few lines of code.

First, we import the necessary libraries: NumPy, pandas, and the relevant sklearn modules.
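The original import listing is not available here, so the following is a minimal sketch of the imports that the steps in this article would typically need. The exact set is an assumption based on the libraries named in the text.

```python
# Assumed imports for the workflow described in this article.
import numpy as np                                      # array handling and reshape
import pandas as pd                                     # CSV loading and cleaning
from sklearn.linear_model import LinearRegression       # the regression model
from sklearn.model_selection import train_test_split    # train/test splitting
```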

Then we read the CSV data using the pandas library and drop the columns that will not be part of our linear regression model.
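As an illustration of this step, here is a small sketch that loads CSV data and drops an unused column. The CSV content and the column names (id, hours, score) are hypothetical stand-ins, since the article's dataset is not shown; an in-memory string is used in place of a file path so the example runs on its own.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
csv_text = """id,hours,score
1,1.0,52
2,2.0,55
3,3.0,61
4,4.0,64
"""

# pd.read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Drop a column that will not be used by the model (hypothetical "id" column).
df = df.drop(columns=["id"])
print(df.columns.tolist())
```

With a real dataset you would pass the file path to pd.read_csv instead of the StringIO object.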

We can handle missing data using the fillna function of pandas. With forward fill, each missing value is replaced with the previous value in the same column rather than being removed.
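A minimal sketch of forward filling is given below, using a small made-up DataFrame. It uses DataFrame.ffill, which performs the same forward fill as fillna with the forward-fill method; the column names and values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with gaps.
df = pd.DataFrame({
    "hours": [1.0, 2.0, np.nan, 4.0],
    "score": [52.0, np.nan, 61.0, 64.0],
})

# Forward fill: each missing value takes the previous value in its column.
df = df.ffill()
print(df)
```

Note that a missing value in the first row has no previous value to copy, so forward fill would leave it as NaN.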

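We convert the feature data into a two-dimensional array using NumPy's reshape function, because sklearn's linear regression model expects the input features as a 2D array of shape (n_samples, n_features). The target can remain one-dimensional.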
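The reshape step can be sketched as follows for a single feature. The DataFrame and its column names are hypothetical; reshape(-1, 1) turns a 1D array of n values into an (n, 1) column.

```python
import numpy as np
import pandas as pd

# Hypothetical single-feature dataset.
df = pd.DataFrame({
    "hours": [1.0, 2.0, 3.0, 4.0],
    "score": [52.0, 55.0, 61.0, 64.0],
})

# Features must be 2D: one row per sample, one column per feature.
X = np.array(df["hours"]).reshape(-1, 1)
# The target can stay 1D.
y = np.array(df["score"])

print(X.shape, y.shape)  # (4, 1) (4,)
```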

Then we split the data into a training set and a test set using sklearn's train_test_split function.
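A short sketch of the split is shown below on synthetic data. The 25% test size and the fixed random_state are assumptions chosen for illustration; the article does not state which values were used.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data: 20 samples, one feature.
X = np.arange(20, dtype=float).reshape(-1, 1)
y = 3.0 * X.ravel() + 2.0

# Hold out 25% of the rows for testing; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

print(X_train.shape, X_test.shape)  # (15, 1) (5, 1)
```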

Now we use sklearn's LinearRegression class to fit a linear regression model to the data.

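The fit method estimates the regression coefficients from the training data. The score method then evaluates the model on the test set; for LinearRegression it returns the coefficient of determination (R²), where 1.0 is a perfect fit, rather than a classification-style accuracy.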
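Fitting and scoring can be sketched as below. The data is synthetic (a noisy line with slope 3 and intercept 2, chosen only for illustration), and the model variable is named regr to match the predict call later in the article.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic noisy line: y = 3x + 2 + noise (assumed values for illustration).
rng = np.random.default_rng(0)
X = np.linspace(0, 10, 50).reshape(-1, 1)
y = 3.0 * X.ravel() + 2.0 + rng.normal(scale=0.5, size=50)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

regr = LinearRegression()
regr.fit(X_train, y_train)        # estimate coefficients from training data
r2 = regr.score(X_test, y_test)   # R^2 on held-out data
print(f"R^2 on the test set: {r2:.3f}")
```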

We can print the fitted slope and intercept of the linear regression model through its coef_ and intercept_ attributes.
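For example, on noise-free synthetic data generated from y = 3x + 2 (an assumed toy dataset), the fitted values should come out very close to 3 and 2:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Noise-free synthetic data from y = 3x + 2.
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 3.0 * X.ravel() + 2.0

regr = LinearRegression().fit(X, y)

# coef_ holds one slope per feature; intercept_ is the constant term.
print("slope:", regr.coef_)
print("intercept:", regr.intercept_)
```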

To predict on new data, we can use the predict method of the linear regression model as follows:

y_pred = regr.predict(X_test)
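The same call works for inputs the model has never seen, as long as they are shaped like the training features. Here is a small self-contained sketch; the training data and the new inputs in X_new are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Train on noise-free synthetic data from y = 3x + 2.
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 3.0 * X.ravel() + 2.0
regr = LinearRegression().fit(X, y)

# New inputs must also be 2D: one row per sample, one column per feature.
X_new = np.array([[12.0], [15.0]])
y_new = regr.predict(X_new)
print(y_new)  # values close to 38 and 47
```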

That’s it. You can adapt the above code to run linear regression on your own data. If you liked this article and want to read more like it, please like and subscribe.

Rohit Raj
Thrive in AI

Studied at IIT Madras and IIM Indore. Love Data Science