Pros and cons of common Machine Learning algorithms

eculidean
Nov 19, 2019

When we start learning a new machine learning algorithm, it always helps to know the strengths and weaknesses of its predecessors in order to understand the new algorithm well, so I’ve created a list of pros and cons for common machine learning models.

Linear Regression

Pros

  1. Simple method
  2. Highly interpretable, since each coefficient maps directly to a feature effect (see the sketch below)
  3. Easy to implement

Cons

  1. Assumes a linear relationship between the dependent and independent variables, which rarely holds exactly in practice
  2. Sensitive to outliers
  3. With few observations, it tends to overfit and starts fitting noise
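
A minimal scikit-learn sketch on synthetic data (the toy weights and noise level are illustrative assumptions, not from the article): the fitted coefficients can be read off directly, which is what makes linear regression so interpretable.

```python
# Fit a linear regression on synthetic data and inspect the weights.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                # 100 samples, 3 features
true_w = np.array([2.0, -1.0, 0.5])          # assumed ground-truth weights
y = X @ true_w + rng.normal(scale=0.1, size=100)

model = LinearRegression().fit(X, y)
print(model.coef_)       # recovered weights, close to true_w
print(model.intercept_)
```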

Ridge Regression

Pros

  1. Trades variance for bias (i.e., in the presence of collinearity, it is worth accepting biased estimates in order to lower the variance)
  2. Helps prevent overfitting

Cons

  1. Increases bias
  2. Requires tuning the regularization strength alpha (a hyperparameter); cross-validation can automate this, as sketched below
  3. Model interpretability is low
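
Rather than guessing a "perfect" alpha, cross-validation can choose it. A minimal sketch using scikit-learn's RidgeCV (the alpha grid and synthetic data are assumptions for illustration):

```python
# Pick the ridge penalty alpha by cross-validation instead of by hand.
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
y = X[:, 0] + rng.normal(scale=0.5, size=50)

alphas = np.logspace(-3, 3, 13)              # candidate penalties
model = RidgeCV(alphas=alphas).fit(X, y)
print(model.alpha_)   # alpha chosen by cross-validation
print(model.coef_)    # coefficients are shrunk, but not exactly zero
```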

LASSO Regression

Pros

  1. Selects features by shrinking coefficients towards (and exactly to) zero, as the sketch below shows
  2. Helps avoid overfitting

Cons

  1. The coefficients of the selected features will be highly biased (shrunk towards zero).
  2. For n << p (n = number of data points, p = number of features), LASSO selects at most n features.
  3. From a group of correlated features, LASSO selects only one, and the choice is essentially arbitrary.
  4. For different bootstrapped samples of the data, the selected features can differ considerably.
  5. Prediction performance is often worse than Ridge regression.
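
A minimal sketch of LASSO's feature selection (alpha and the synthetic data are illustrative assumptions): the L1 penalty drives the coefficients of uninformative features exactly to zero.

```python
# LASSO zeroes out the coefficients of uninformative features.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))                # only 2 of 8 features matter
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

model = Lasso(alpha=0.1).fit(X, y)
print(model.coef_)    # most entries are exactly 0.0
```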

Elastic Net Regression

Pros

  1. Can select more than n predictors when n << p, whereas LASSO saturates at n selected features (see the sketch below).

Cons

  1. Computationally more expensive than LASSO or Ridge.
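
A minimal sketch in the n << p regime (the dimensions and penalty settings are assumptions for illustration): l1_ratio blends the L1 (LASSO) and L2 (Ridge) penalties, so Elastic Net is not capped at n selected features.

```python
# Elastic Net in the n << p setting, where LASSO would saturate.
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 100))               # n = 30 samples, p = 100 features
y = X[:, 0] + X[:, 1] + rng.normal(scale=0.1, size=30)

model = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)
print(np.sum(model.coef_ != 0))  # nonzero coefficients; not capped at n as in pure LASSO
```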

Logistic Regression

Pros

  1. Doesn’t assume a linear relationship between the independent and dependent variables.
  2. The dependent variable does not need to be normally distributed.
  3. No homogeneity-of-variance assumption is required.
  4. Results are easy to interpret (coefficients are log-odds), as in the sketch below.

Cons

  1. Requires more data to achieve stable estimates.
  2. Effective mostly on linearly separable data, since the decision boundary is linear in the features.
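
A minimal sketch on scikit-learn's built-in breast-cancer data set (the data set choice and max_iter value are illustrative assumptions): the model yields class probabilities directly.

```python
# Logistic regression: probabilities out, log-odds coefficients inside.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=5000).fit(X_train, y_train)
print(model.score(X_test, y_test))       # test accuracy
print(model.predict_proba(X_test[:3]))   # per-class probabilities
```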

Decision Tree

Pros

  1. Does not require standardization or normalization
  2. Easy to implement
  3. Less data-preparation work
  4. Missing values have little impact (in implementations that handle them natively)

Cons

  1. Doesn’t capture smooth decision boundaries well, since splits are axis-aligned
  2. Doesn’t work well when variables are uncorrelated
  3. Has high variance due to its greedy splitting strategy
  4. Longer training time
  5. Can become overly complex unless depth is limited (as in the sketch below)
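
A minimal sketch on the iris data set (the data set and depth limit are assumptions for illustration): no scaling is needed, max_depth keeps the tree from growing overly complex, and the learned rules print as readable if/else splits.

```python
# A shallow decision tree; export_text shows the learned rules.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(model))   # human-readable split rules
```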

K-Nearest Neighbors

Pros

  1. No training period (a lazy learner)
  2. Easy to implement
  3. New data can be added seamlessly, without retraining the algorithm

Cons

  1. Does not work well in high dimensions
  2. Sensitive to noisy data, missing values and outliers
  3. Does not work well with large data sets, as distances to every stored point must be computed at prediction time
  4. Needs feature scaling, since predictions depend on raw distances (handled with a scaler in the sketch below)
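
A minimal sketch (the data set and k are illustrative assumptions): a StandardScaler is placed in front of the classifier in a pipeline so that no single feature dominates the distance computation.

```python
# k-NN with feature scaling, since predictions depend on raw distances.
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
model.fit(X, y)                 # "training" only stores the scaled data
print(model.predict(X[:3]))
```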

Random Forest

Pros

  1. Each tree is trained on a bootstrap sample, so roughly one third of the data is left out and can be used for validation (the out-of-bag estimate shown in the sketch below)
  2. High predictive accuracy on many tasks
  3. Provides a feature-importance estimate
  4. Some implementations can handle missing values automatically
  5. No feature scaling is required

Cons

  1. Low interpretability; essentially a black-box approach
  2. Can overfit the data
  3. Requires more computational resources
  4. Prediction time is high, since many trees must be evaluated
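
A minimal sketch (the data set and n_estimators are illustrative assumptions): oob_score=True scores each tree on the roughly one third of rows it never saw, giving a built-in validation estimate, and feature_importances_ provides the importance ranking mentioned above.

```python
# Random forest with an out-of-bag (OOB) validation estimate.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
model = RandomForestClassifier(n_estimators=200, oob_score=True,
                               random_state=0).fit(X, y)
print(model.oob_score_)             # accuracy on out-of-bag samples
print(model.feature_importances_)   # per-feature importance estimate
```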

We cannot rank machine learning models purely by their pros and cons. Selecting a machine learning model depends on the business use case we choose to solve (the no-free-lunch theorem). This comparison should give you some idea of the reasons for using different models on your data set.

I haven’t said anything here about boosting methods. There are various boosting methods, such as gradient boosting for reducing both bias and variance, Extreme Gradient Boosting (XGBoost) for utilizing your GPU to the fullest, LightGBM, and CatBoost for handling categorical variables and preventing overfitting. And recently, Stanford University released a new algorithm called Natural Gradient Boosting (NGBoost).

In the next story, we can compare various boosting algorithms.
