Understand any Machine Learning model with these 3 core pillars

hqtquynhtram
3 min read · Oct 16, 2021

You are a beginner in Machine Learning.

You have watched a couple of courses but are still confused about how ML models work under the hood.

This article is for you! If you start with the 3 pillars below and apply them to any ML model, such as Linear Regression, Logistic Regression, Support Vector Machine, XGBoost, etc., you will understand it much more easily and systematically.

Let me first explain what each pillar means.

  1. Algorithm is the form of the mathematical function whose parameters are adjusted to fit the training data points.
  2. Loss Function (or Cost Function) measures the quality of our fitted model. Loosely speaking, a loss function is needed to distinguish good models from bad ones.
  3. Optimizer searches for the optimal point that gives us the best-fitted model with the lowest cost.

Loosely speaking, any ML model works by using an optimizer to find the parameters of its mathematical function that give the lowest loss value.
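
One way to write that sentence as a single formula is sketched below. The notation is an assumption added for illustration: f_θ stands for the algorithm with parameters θ, ℓ is the loss on one training point (x_i, y_i), and the arg min is the point the optimizer searches for. Averaging the loss over the n training points is a common convention, not the only one.

```latex
\hat{\theta} \;=\; \arg\min_{\theta} \; \frac{1}{n} \sum_{i=1}^{n} \ell\big(f_{\theta}(x_i),\, y_i\big)
```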

If we change the form of the algorithm, the loss function, or the optimizer, we end up with a different ML model. That might help you realize that different ML models simply combine different types of algorithm, loss function, and optimizer.

For instance, let’s use the Linear Regression model to illustrate those 3 pillars.

  • Algorithm
  • Loss function
  • Optimizer
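
Here is a minimal sketch of one common way to fill in those three pillars for Linear Regression. It assumes a single feature, mean squared error as the loss, and batch gradient descent as the optimizer. The function names, the toy data, the learning rate, and the number of steps are illustrative assumptions, and plain Python is used so nothing needs to be installed.

```python
def predict(x, w, b):
    """Pillar 1 (algorithm): the linear form y_hat = w * x + b."""
    return w * x + b


def mse_loss(xs, ys, w, b):
    """Pillar 2 (loss function): mean squared error over the training points."""
    return sum((predict(x, w, b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, lr=0.1, steps=2000):
    """Pillar 3 (optimizer): batch gradient descent on the MSE loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [predict(x, w, b) - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data, made up for illustration: points on the line y = 3x + 1.
xs = [i / 49 for i in range(50)]
ys = [3.0 * x + 1.0 for x in xs]

w, b = gradient_descent(xs, ys)
final_loss = mse_loss(xs, ys, w, b)
print(w, b, final_loss)  # w and b should land close to 3 and 1
```

Each pillar lives in its own function, so swapping one of them (a different loss, a different model form, a different update rule) leaves the other two untouched.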

I will show you how we can get a new ML model by changing one of the 3 pillars.

  • Start with the loss function. Suppose I want to penalize variables with large weights W to avoid overfitting, by adding a regularization term, scaled by the lambda parameter, to the original loss function. See the image below.

As you can see, by adding a new regularization term to the loss function, you create different forms of the linear regression model, often known as L1/L2-regularized regression (Lasso and Ridge, respectively).
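
To make the regularization idea concrete, here is a small sketch that assumes a one-feature linear model, mean squared error as the original loss, and an L2 penalty of the form lam * w**2 on the weight only (not on the bias). The name `fit`, the toy data, and the value lam = 0.1 are illustrative assumptions. For an L1 penalty, the added term would be lam * abs(w) instead.

```python
def fit(xs, ys, lam=0.0, lr=0.1, steps=2000):
    """Gradient descent on MSE + lam * w**2 (L2 penalty on the weight only)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradient of the MSE part plus the gradient of the penalty, 2 * lam * w.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs)) + 2.0 * lam * w
        grad_b = 2.0 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data, made up for illustration: points on the line y = 3x + 1.
xs = [i / 49 for i in range(50)]
ys = [3.0 * x + 1.0 for x in xs]

w_plain, b_plain = fit(xs, ys, lam=0.0)  # original loss
w_ridge, b_ridge = fit(xs, ys, lam=0.1)  # loss with the L2 term added
print(w_plain, w_ridge)  # the penalized weight should be noticeably smaller
```

Only the loss changed between the two calls. The algorithm and the optimizer are the same, yet the fitted weight is pulled toward zero.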

  • Or start with the form of the model’s algorithm: by adding the square of a feature (an independent variable) as an extra term, you get a polynomial regression model instead of linear regression.
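
The sketch below illustrates this change of algorithm form under a few assumptions: the data is a made-up quadratic curve, the loss stays mean squared error, and the optimizer stays plain gradient descent. The helper names `fit` and `mse` are hypothetical. The only difference between the two fits is whether x squared is included as a feature.

```python
def fit(features, ys, lr=0.1, steps=3000):
    """Gradient descent on MSE for y_hat = sum_j w[j] * row[j] + b."""
    n, d = len(features), len(features[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        errors = [sum(wj * fj for wj, fj in zip(w, row)) + b - y
                  for row, y in zip(features, ys)]
        for j in range(d):
            w[j] -= lr * 2.0 / n * sum(e * row[j] for e, row in zip(errors, features))
        b -= lr * 2.0 / n * sum(errors)
    return w, b


def mse(features, ys, w, b):
    """Mean squared error of the fitted model on the given points."""
    return sum((sum(wj * fj for wj, fj in zip(w, row)) + b - y) ** 2
               for row, y in zip(features, ys)) / len(ys)


# Toy data, made up for illustration: y = 2x^2 - x + 0.5 on [-1, 1].
xs = [-1.0 + i / 20 for i in range(41)]
ys = [2.0 * x ** 2 - x + 0.5 for x in xs]

linear_features = [[x] for x in xs]        # algorithm: w1*x + b
poly_features = [[x, x ** 2] for x in xs]  # algorithm: w1*x + w2*x^2 + b

w_lin, b_lin = fit(linear_features, ys)
w_poly, b_poly = fit(poly_features, ys)

loss_lin = mse(linear_features, ys, w_lin, b_lin)
loss_poly = mse(poly_features, ys, w_poly, b_poly)
print(loss_lin, loss_poly)  # the polynomial form should fit this curve far better
```

Note that the model is still linear in its weights, which is why the same loss and the same optimizer keep working after the squared feature is added.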

Now, you can challenge yourself by writing down the equations of those 3 pillars for the ML models you’ve learnt.

Happy learning!

hqtquynhtram

Passionate about answering questions with data and building AI products. Feel free to contact me via linkedin.com/in/tramdata to share interests in data and product.