Introduction to Machine Learning: Regression

Esma Bozkurt
Published in Analytics Vidhya · Mar 11, 2021

Hello! Today’s topic is Machine Learning and its algorithms. But before starting, I would like to touch on the relationship between statistical learning and machine learning as well. So prepare your coffee, because I’m going to tell you a lot today!

Most learning models were born in statistics, and the optimization and prediction performance of these models were later advanced in computer science. What a wonderful combination. So, what are their differences in technical terms?

In statistical learning, a model lets us interpret how the response variable is expected to change when the independent variables change. Machine learning approaches, on the other hand, make fewer assumptions, are easier to use, and are result-oriented. They are like two close friends that we cannot separate from each other. Now that we understand the difference, we can focus on Machine Learning.

Machine Learning

There are three types of Machine Learning: Supervised, Unsupervised, and Reinforcement Learning.

  • Supervised Learning (Labeled Input)
  • Unsupervised Learning (Unlabeled Input)
  • Reinforcement Learning

In supervised learning, the inputs are labeled and the output is known. In unsupervised learning, the input is unlabeled, and we expect the machine to discover hidden patterns and trends in the data. In reinforcement learning, the machine learns through its own actions, without labeled inputs and outputs: it works with a system that gives rewards for right actions and penalties for wrong ones. Autonomous vehicles are the best example of this.

There are two types of Supervised Learning: Regression and Classification. I will explain Regression in this article; you can find Classification in my next article. (Regression has many types, and some of them, such as logistic regression, actually fall under classification. Do not confuse those with Linear Regression.) We use regression to predict a continuous value by establishing a linear relationship between variables (such as predicting the price of a car), while we use classification for problems with yes-no answers (such as whether an e-mail is spam or not).

Regression

Simple Linear Regression examines the target variable against one other variable, while Multiple Linear Regression examines more than one other variable for the target variable. For example, let’s first model the target variable using a single variable:
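(The original post showed this step as a code screenshot. As a stand-in, here is a minimal scikit-learn sketch on made-up data; the numbers and the car-price framing are hypothetical.)

```python
# A minimal sketch of simple linear regression with scikit-learn.
# The data below is invented purely for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: one independent variable (x) and a target (y)
x = np.array([[1.0], [1.5], [2.0], [2.5], [3.0]])        # e.g., engine size
y = np.array([10_000, 14_000, 19_000, 23_500, 28_000])   # e.g., car price

model = LinearRegression()
model.fit(x, y)

print("slope (a):", model.coef_[0])
print("intercept (b):", model.intercept_)
print("R^2 on training data:", model.score(x, y))
```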

We talked about a linear relationship, that is, the change of y depending on x in y = ax + b. Let’s show this on a graph:
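(A graph appeared here in the original post. As a stand-in, here is a small matplotlib sketch that continues the previous snippet, reusing its x, y, and model, and plots the data together with the fitted line.)

```python
# Plot the hypothetical observations and the fitted line y = ax + b.
# Assumes x, y, and model from the previous snippet are in scope.
import matplotlib.pyplot as plt

plt.scatter(x, y, label="observations")
plt.plot(x, model.predict(x), color="red", label="fitted line y = ax + b")
plt.xlabel("x")
plt.ylabel("y")
plt.legend()
plt.show()
```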

However, it is rare for a dependent variable to be explained by only one variable. In this case, we use multiple regression, which tries to explain a dependent variable using multiple independent variables. Multiple regression can be linear or nonlinear. Multiple linear regression (MLR) is used to determine a mathematical relationship between a set of variables. In other words, MLR examines how multiple independent variables relate to a dependent variable (y = b0 + b1x1 + b2x2 + … + bnxn). Now let’s look at the relationship of our target variable with all other variables except itself:
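(Again, the original showed a screenshot here. Below is a hedged sketch of multiple linear regression; the column names and values are invented for illustration.)

```python
# A sketch of multiple linear regression on a hypothetical car-price dataset.
# All feature names and numbers are made up; replace them with your own data.
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({
    "engine_size": [1.0, 1.5, 2.0, 2.5, 3.0, 1.2, 2.2],
    "age":         [5,   3,   4,   2,   1,   6,   3],
    "mileage":     [90,  60,  70,  40,  20,  110, 55],  # in thousands
    "price":       [8_000, 14_500, 15_000, 24_000, 30_000, 6_500, 18_000],
})

X = df.drop(columns="price")  # all variables except the target itself
y = df["price"]

mlr = LinearRegression().fit(X, y)
print("coefficients:", dict(zip(X.columns, mlr.coef_)))
print("intercept:", mlr.intercept_)
print("R^2:", mlr.score(X, y))
```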

Using all the variables, our model’s success rate (its R² score) has increased.

Other algorithms used to analyze multivariate regression data are:

  • Ridge Regression (L2 regularization): the aim is to find the coefficients that minimize the error sum of squares by applying a penalty to those coefficients. It is resistant to over-fitting. It builds a model with all variables and shrinks the coefficients of unrelated variables toward zero, but not exactly to zero.
  • Lasso Regression (L1 regularization): unlike Ridge regression, it sets the coefficients of unrelated variables exactly to zero. It therefore performs both variable selection and regularization, increasing the predictive accuracy and interpretability of the model it produces.
  • Elastic Net: offers a model that combines the good sides of the two.
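Here is a minimal sketch of how the three could be compared in scikit-learn on a synthetic dataset (the alpha values are arbitrary examples, not tuned):

```python
# Compare Ridge (L2), Lasso (L1), and Elastic Net on synthetic regression data.
# alpha controls the penalty strength; 1.0 is just a placeholder value.
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, Lasso, ElasticNet

X, y = make_regression(n_samples=100, n_features=10, noise=10.0, random_state=42)

for model in (Ridge(alpha=1.0), Lasso(alpha=1.0), ElasticNet(alpha=1.0, l1_ratio=0.5)):
    model.fit(X, y)
    zeroed = sum(coef == 0 for coef in model.coef_)  # Lasso/Elastic Net can zero coefficients out
    print(f"{type(model).__name__}: R^2={model.score(X, y):.3f}, zero coefficients={zeroed}")
```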

Let me explain what R² and Mean Squared Error are here.

R squared measures how much of the variability in the dependent variable can be explained by the independent variables in the model. In simple linear regression, it is the square of the correlation coefficient between the two variables.

Mean Squared Error is the average of the squared differences between your predicted and actual values: MSE = (1/n) Σ (yᵢ − ŷᵢ)². It gives a concrete number to compare with other model results and helps you choose the best regression model.
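Both metrics are available in scikit-learn’s metrics module; here is a short sketch with hypothetical actual and predicted values:

```python
# Computing MSE and R^2 with scikit-learn; the values below are made up.
from sklearn.metrics import mean_squared_error, r2_score

y_true = [10_000, 14_000, 19_000, 23_500, 28_000]  # hypothetical actual values
y_pred = [11_000, 13_500, 18_000, 24_000, 27_500]  # hypothetical predictions

print("MSE:", mean_squared_error(y_true, y_pred))
print("R^2:", r2_score(y_true, y_pred))
```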

Model Evaluation

  • The data we use is usually split into training data and test data. The training set contains known outputs, and the model learns on this data so that it can generalize to other data later on. We keep the test dataset (or subset) aside in order to evaluate our model’s predictions on data it has not seen.
  • Cross Validation is very similar to a train/test split, but it is applied to more subsets. In K-Folds Cross Validation we split our data into k different subsets (or folds), train on k−1 of them, and hold out the remaining fold as test data. We repeat this so that each fold takes a turn as the test set.

From now on, we will separate the data this way before we make predictions.
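Here is a sketch of both approaches with scikit-learn, on a synthetic dataset (swap in your own X and y):

```python
# A train/test split and 5-fold cross-validation with scikit-learn.
# The dataset here is synthetic, generated only for demonstration.
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split, cross_val_score

X, y = make_regression(n_samples=200, n_features=5, noise=15.0, random_state=0)

# Hold out 20% of the data as the test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = LinearRegression().fit(X_train, y_train)
print("test R^2:", model.score(X_test, y_test))

# K-Folds: each of the 5 folds takes a turn as the test set
scores = cross_val_score(LinearRegression(), X, y, cv=5)
print("cross-validation R^2 scores:", scores)
```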

I will explain classification in my next article, my goal is to explain machine learning to you step by step. If you like it, follow me, have a nice day! :)

GitHub: esmabozkurt
