# ML Scikit-learn: Regression

Regression investigates the relationship between a dependent variable *(output)* and one or more independent variables *(input)*. It falls into the supervised learning category of Machine Learning and applies when the output is continuous. Hence regression is a natural choice for building a model whose output is a continuous value rather than a discrete one.

**Insight on Linear regression:**

Linear regression makes its predictions by following the steps below:

- It tries to fit the best line between the input and output data, so that the line passes as close as possible to all the points (*y = mx + c*, where m is the slope and c is the intercept).
- While fitting the data it tries to minimize the *sum of squared errors, (actual value - predicted value)²*. This can be done either with the ordinary least squares method (scikit-learn) or with gradient descent.
- After fitting the line it will have the values of m and c.
- Then, using the slope ‘m’ and the intercept ‘c’, it predicts the output for a new input.
- The intercept ‘c’ decides where the line crosses the y-axis. If c = 0 then the line passes through the origin.
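The steps above can be sketched in plain NumPy. This is only an illustration on made-up data that lies exactly on y = 2x + 1; the learning rate and iteration count are assumed values that happen to converge here, and the gradient descent loop minimizes the mean (rather than the sum) of squared errors, which has the same minimum.

```python
import numpy as np

# Toy data lying exactly on y = 2x + 1 (made up for illustration)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 5.0, 7.0, 9.0, 11.0])

# Ordinary least squares: closed-form slope and intercept for one input
x_mean, y_mean = x.mean(), y.mean()
m_ols = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
c_ols = y_mean - m_ols * x_mean

# Gradient descent on the mean squared error, starting from m = c = 0
m_gd, c_gd = 0.0, 0.0
learning_rate = 0.05  # assumed value, small enough to converge on this data
for _ in range(5000):
    error = (m_gd * x + c_gd) - y  # predicted value - actual value
    m_gd -= learning_rate * 2 * np.mean(error * x)
    c_gd -= learning_rate * 2 * np.mean(error)

print(m_ols, c_ols)
print(round(m_gd, 4), round(c_gd, 4))
```

Both approaches should arrive at practically the same slope and intercept; the closed form gets there in one step, while gradient descent approaches it gradually.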

Example:

Let’s train a model and predict the net worth of a person from their age. The following is just an assumed data set, not real data.

**Roll up your sleeves for Scikit implementation:**

Let’s get started !!!

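# It is good practice to use 70% of the data as the training set and the remaining 30% to check the model's accuracy.

from sklearn.model_selection import train_test_split

# `inputs` and `outputs` are placeholder names for the age and net worth columns of the data set

input_train, input_test, output_train, output_test = train_test_split(inputs, outputs, test_size=0.3)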

from sklearn.linear_model import LinearRegression

model = LinearRegression()

# Training or fitting the data

model.fit(input_train, output_train)

# Prediction

model.predict(input_test)

# Getting slope

print(model.coef_)

# Getting Intercept

print(model.intercept_)
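To see that the slope and intercept are all the fitted model uses for prediction, here is a small sketch. The ages and net worths are made-up numbers, and `by_hand` and `by_model` are names chosen just for this illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up ages (one column per feature) and net worths
ages = np.array([[25], [30], [35], [40], [45]])
net_worths = np.array([50.0, 80.0, 95.0, 130.0, 150.0])

model = LinearRegression().fit(ages, net_worths)
m = model.coef_[0]    # slope for the single feature
c = model.intercept_  # intercept

# Predict for new ages twice: with y = mx + c, and with predict()
new_ages = np.array([[50], [60]])
by_hand = m * new_ages[:, 0] + c
by_model = model.predict(new_ages)
print(by_hand, by_model)
```

Note that scikit-learn expects the input as a 2-D array (rows are samples, columns are features), which is why each age sits in its own inner list.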

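# Finding accuracy (for a regressor, score() returns the R squared value)

model.score(input_test, output_test)

# Finding R squared score

from sklearn.metrics import r2_score

r2_score(output_test, model.predict(input_test))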

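The **‘R squared score’** is at most 1, and the closer it is to 1, the better the model explains the data. On test data it usually falls between 0 and 1, but it can turn negative when the model does worse than always predicting the mean of the outputs.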
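As a rough sketch of where that number comes from, the snippet below computes R squared by hand as 1 - (sum of squared errors) / (total sum of squares) and compares it with scikit-learn's own values. The data is synthetic (random ages with a noisy linear net worth), generated only for this illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic data: net worth grows roughly linearly with age, plus noise
rng = np.random.default_rng(0)
ages = rng.uniform(20, 65, size=(100, 1))
net_worths = 6.0 * ages[:, 0] + rng.normal(0, 20, size=100)

# 70% training data, 30% test data
input_train, input_test, output_train, output_test = train_test_split(
    ages, net_worths, test_size=0.3, random_state=0
)

model = LinearRegression().fit(input_train, output_train)
predicted = model.predict(input_test)

# R squared = 1 - (sum of squared errors) / (total sum of squares)
ss_res = np.sum((output_test - predicted) ** 2)
ss_tot = np.sum((output_test - output_test.mean()) ** 2)
r2_by_hand = 1 - ss_res / ss_tot

print(r2_by_hand)
print(r2_score(output_test, predicted))
print(model.score(input_test, output_test))
```

All three printed values should agree, since `score()` on a regressor and `r2_score` compute the same quantity.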

I used a simple one-variable linear regression to explain regression, but the same holds for multivariate linear regression as well; the only difference is that extra features/variables are available to predict the output.
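A minimal sketch of the multi-feature case, assuming a second, hypothetical feature (years of experience) next to age. The outputs are constructed from a known formula so that the fitted coefficients can be checked against it.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data: columns are age and (hypothetical) years of experience
X = np.array([[25, 2], [30, 8], [35, 5], [40, 15], [45, 12], [50, 25]])

# Output built exactly as 3*age + 10*experience + 5, so the fit is checkable
y = 3 * X[:, 0] + 10 * X[:, 1] + 5

model = LinearRegression().fit(X, y)
print(model.coef_)       # one slope per feature
print(model.intercept_)  # single intercept
```

The API is unchanged: `coef_` simply holds one slope per input column instead of a single value.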