A busy street in Tokyo, Japan. Finding an English-speaking person here is as hard as finding the best hyperparameters for boosting algorithms. 😉

Hyperparameters Optimization for LightGBM, CatBoost and XGBoost Regressors using Bayesian Optimization.

How to optimize hyperparameters of boosting machine learning algorithms with Bayesian Optimization?

Dayal Chand Aichara · Published in Analytics Vidhya · Aug 16, 2019 · 5 min read

Boosting machine learning algorithms are widely used because they give better accuracy than simpler ones. The performance of these algorithms depends on their hyperparameters, and an optimal set of parameters can help achieve higher accuracy. Finding hyperparameters manually is tedious and computationally expensive, so automating hyperparameter tuning is important. RandomSearch, GridSearchCV, and Bayesian optimization are generally used to optimize hyperparameters; Bayesian optimization gives better and faster results compared to the other methods.

How does Bayesian optimization work?

  1. Build a surrogate probability model of the objective function
  2. Find the hyperparameters that perform best on the surrogate
  3. Apply these hyperparameters to the true objective function
  4. Update the surrogate model incorporating the new results
  5. Repeat steps 2–4 until max iterations or time is reached

A Bayesian optimizer builds a probability model of a given objective function and uses it to select the most promising hyperparameters to evaluate on the true objective function. If you want to study this in depth, read here and here.

How to optimize hyperparameters with Bayesian optimization?

I will use the bayesian-optimization Python package to demonstrate Bayesian model-based optimization. Install it via pip:

pip install bayesian-optimization

The hyperparameter optimization process can be broken into three parts.

Part 1 — Define objective function

Define an objective function which takes hyperparameters as input and returns a score to be maximized or minimized.

Part 2 — Define search space of hyperparameters

Define a range for each hyperparameter to optimize. Keep the parameter ranges narrow for better results.

Part 3 — Define a surrogate model of the objective function and call it.

Create a Bayesian optimization object and call it to maximize the objective output. The Bayesian optimization function takes three inputs: the objective function, the search space, and random_state.
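As a minimal sketch of this pattern (assuming an objective function named objective_function and a bounds dictionary named pbounds, both of which are defined per model later in this post), the whole loop looks like this:

from bayes_opt import BayesianOptimization

# pbounds maps each hyperparameter name to a (lower, upper) range (the search space).
optimizer = BayesianOptimization(
    f=objective_function,   # Part 1: the objective to maximize
    pbounds=pbounds,        # Part 2: the search space
    random_state=42,
)

# init_points random evaluations, then n_iter Bayesian optimization steps.
optimizer.maximize(init_points=5, n_iter=25)

print(optimizer.max)        # best target value and the hyperparameters that produced it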

Let’s implement Bayesian optimization for boosting machine learning algorithms for regression.

Import libraries and load data.

I will use the Boston Housing dataset for this tutorial.
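A minimal version of this setup is sketched below (note that load_boston was part of scikit-learn when this post was written and has been removed from newer versions, so treat the loader as an assumption about your environment):

import pandas as pd
import lightgbm as lgb
import xgboost as xgb
from bayes_opt import BayesianOptimization
from sklearn.datasets import load_boston
from sklearn.metrics import r2_score

# Load Boston Housing data into a DataFrame with Price as the target column.
boston = load_boston()
data = pd.DataFrame(boston.data, columns=boston.feature_names)
data['Price'] = boston.target

X = data.drop('Price', axis=1)
y = data['Price']
print(data.head())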

First 5 rows of Boston Housing data.

Our data has 13 predictor variables (independent variables) and Price as the criterion variable (dependent variable).

1. LightGBM Regressor

a. Objective Function

The objective function will return the negative of l1 (absolute loss, alias mean_absolute_error, mae), because the optimizer maximizes the objective output. You can also use l2, l2_root, or poisson instead of l1.
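A sketch of such an objective, assuming the X and y defined above; the six hyperparameters and their handling here are illustrative, since the original gist is not reproduced:

def lgb_l1_objective(num_leaves, feature_fraction, bagging_fraction,
                     max_depth, lambda_l1, lambda_l2):
    params = {
        'objective': 'regression',
        'metric': 'l1',                     # absolute loss (mae)
        'num_leaves': int(num_leaves),      # bayes_opt passes floats, so cast integer params
        'feature_fraction': feature_fraction,
        'bagging_fraction': bagging_fraction,
        'max_depth': int(max_depth),
        'lambda_l1': lambda_l1,
        'lambda_l2': lambda_l2,
        'verbosity': -1,
    }
    train_set = lgb.Dataset(X, label=y)
    cv_result = lgb.cv(params, train_set, num_boost_round=100,
                       nfold=5, stratified=False, seed=42)
    # Return the negative of the best (lowest) mean l1, so that maximizing it works.
    # Note: the result key may be 'valid l1-mean' in newer LightGBM versions.
    return -min(cv_result['l1-mean'])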

Note:

LightGBM and XGBoost don't have a built-in R-squared metric. If you want to use an R2 metric instead of the other evaluation metrics, write your own R2 metric.

See an example of an objective function with an R2 metric below.
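A sketch of such a metric; the usage comments are assumptions about how it plugs into the lgb.cv call from the objective above:

# Custom R2 metric for LightGBM: feval must return 3 outputs
# (metric name, metric value, is_higher_better).
def lgb_r2_metric(preds, train_data):
    labels = train_data.get_label()
    return 'r2', r2_score(labels, preds), True

# In the objective function, pass it to lgb.cv and disable the built-in metrics:
#   params['metric'] = 'None'
#   cv_result = lgb.cv(params, train_set, num_boost_round=100, nfold=5,
#                      stratified=False, feval=lgb_r2_metric, seed=42)
#   return max(cv_result['r2-mean'])   # R2 is higher-is-better, so no sign flip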

b. Search Space

Define a range for each input parameter of the objective function. The number of entries depends on how many hyperparameters you want to optimize; this example has 6 hyperparameters.
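For example, matching the six illustrative hyperparameters above (the ranges are assumptions, not the original values):

lgb_pbounds = {
    'num_leaves': (20, 100),
    'feature_fraction': (0.5, 1.0),
    'bagging_fraction': (0.5, 1.0),
    'max_depth': (3, 12),
    'lambda_l1': (0.0, 5.0),
    'lambda_l2': (0.0, 5.0),
}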

c. Surrogate Model and Optimization

Define a Bayesian optimization object and maximize the output of the objective function. The sum of init_points and n_iter equals the total number of optimization rounds.

Put it all together in a single function, as sketched below.
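A minimal sketch of that wrapper (the function name and the round counts are my own choices):

def bayesian_opt_lgb(init_points=5, n_iter=25):
    # Build the optimizer from the objective and the search space, then run it.
    optimizer = BayesianOptimization(
        f=lgb_l1_objective,          # the objective defined above
        pbounds=lgb_pbounds,
        random_state=42,
    )
    # Total number of optimization rounds = init_points + n_iter.
    optimizer.maximize(init_points=init_points, n_iter=n_iter)
    return optimizer

optimizer = bayesian_opt_lgb()
print(optimizer.max['params'])       # best hyperparameters found
print(optimizer.max['target'])       # best (negative l1) score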

The output of the above code will be a table with the objective function output as target, along with the values of the input parameters passed to the objective function. To get the best parameters, use optimizer.max['params'].

Hyperparameters optimization results table of LightGBM Regressor

2. CatBoost Regressor

a. Objective Function

The objective function takes two inputs: depth and bagging_temperature. It will return the maximum mean R-squared value on the test folds.
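A sketch of that objective using CatBoost's built-in cross-validation and its built-in R2 eval metric; the iteration count, fold count, and loss function below are assumptions:

from catboost import Pool, cv as catboost_cv

def cb_objective(depth, bagging_temperature):
    params = {
        'loss_function': 'RMSE',
        'eval_metric': 'R2',                         # CatBoost ships an R2 metric
        'depth': int(depth),                         # bayes_opt passes floats, so cast
        'bagging_temperature': bagging_temperature,
        'iterations': 100,
        'verbose': False,
    }
    cv_result = catboost_cv(Pool(X, label=y), params, fold_count=5)
    # Return the best mean R2 observed on the test folds (higher is better).
    return cv_result['test-R2-mean'].max()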

b. Search Space

The objective function has only two input parameters, so the search space also has only two entries.
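For example (again, the exact ranges are assumptions):

cb_pbounds = {
    'depth': (4, 10),
    'bagging_temperature': (0.0, 10.0),
}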

c. Surrogate Model and Optimization

The Bayesian optimizer will tune depth and bagging_temperature to maximize the R2 value.
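A minimal sketch of that call, reusing the cb_objective and cb_pbounds defined above:

cb_optimizer = BayesianOptimization(f=cb_objective, pbounds=cb_pbounds, random_state=42)
cb_optimizer.maximize(init_points=5, n_iter=25)
print(cb_optimizer.max['params'])    # best depth and bagging_temperature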

Hyperparameters optimization results table for CatBoost Regressor

3. XGBoost Regressor

a. Objective Function

The objective function returns the maximum R2 value obtained in cross-validation for the given input parameters (see the sketch after the notes below).

Note:

  1. If eval_metric is included in the parameters, use a small early_stopping_rounds (10 or less). Why? Because training will stop once the evaluation metric has not improved for the given number of early stopping rounds.
  2. If eval_metric is not defined in the parameters, use a larger early_stopping_rounds, but smaller than num_boost_round. Why? Because training will stop at the given early stopping round.
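A sketch of such an objective with a custom R2 metric; the five hyperparameters and their handling are illustrative, and note that newer XGBoost versions rename the feval argument to custom_metric:

# Custom R2 metric for XGBoost: feval must return 2 outputs (metric name, metric value).
def xgb_r2_metric(preds, dtrain):
    labels = dtrain.get_label()
    return 'r2', r2_score(labels, preds)

def xgb_objective(max_depth, eta, subsample, colsample_bytree, gamma):
    params = {
        'objective': 'reg:squarederror',
        'max_depth': int(max_depth),         # bayes_opt passes floats, so cast
        'eta': eta,
        'subsample': subsample,
        'colsample_bytree': colsample_bytree,
        'gamma': gamma,
    }
    dtrain = xgb.DMatrix(X, label=y)
    # eval_metric is not set in params, so use a larger early_stopping_rounds
    # that is still smaller than num_boost_round (see the notes above).
    cv_result = xgb.cv(params, dtrain, num_boost_round=100, nfold=5,
                       feval=xgb_r2_metric, maximize=True,
                       early_stopping_rounds=50, seed=42)
    # Return the best mean R2 on the test folds.
    return cv_result['test-r2-mean'].max()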

b. Search Space
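Matching the five illustrative hyperparameters above (the ranges are assumptions):

xgb_pbounds = {
    'max_depth': (3, 10),
    'eta': (0.01, 0.3),
    'subsample': (0.5, 1.0),
    'colsample_bytree': (0.5, 1.0),
    'gamma': (0.0, 5.0),
}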

c. Surrogate Model and Optimization
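And the optimization call itself, a minimal sketch reusing the xgb_objective and xgb_pbounds defined above:

xgb_optimizer = BayesianOptimization(f=xgb_objective, pbounds=xgb_pbounds, random_state=42)
xgb_optimizer.maximize(init_points=5, n_iter=25)
print(xgb_optimizer.max['params'])   # best hyperparameters found for XGBoost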

Hyperparameters optimization results table of XGBoost Regressor

I hope you have learned the whole concept of hyperparameter optimization with Bayesian optimization. Hyperparameter tuning seems easy now, right?

Conclusion

It is easy to optimize hyperparameters with Bayesian optimization. LightGBM and XGBoost don't have an R2 metric, so we must define our own. There is a small difference between the two: a LightGBM R2 metric should return 3 outputs, whereas an XGBoost R2 metric should return 2 outputs.

We can use different evaluation metrics based on the model's requirements. Keep the search space ranges narrow for better results. bayesian-optimization maximizes the output of the objective function, so the output must be negated for l1 and l2, and kept positive for r2.

Check out the Notebook on GitHub or the Colab Notebook to see the full use cases. Reach out to me on LinkedIn if you have any queries. Happy parameter tuning! Thank you for reading! ☺️

References:

  1. https://towardsdatascience.com/a-conceptual-explanation-of-bayesian-model-based-hyperparameter-optimization-for-machine-learning-b8172278050f
  2. https://towardsdatascience.com/an-introductory-example-of-bayesian-optimization-in-python-with-hyperopt-aae40fff4ffo
  3. https://medium.com/spikelab/hyperparameter-optimization-using-bayesian-optimization-f1f393dcd36d
  4. https://www.kaggle.com/omarito/xgboost-bayesianoptimization
  5. https://github.com/fmfn/BayesianOptimization


Dayal Chand Aichara (Analytics Vidhya): Data Scientist at KPMG Ignition Tokyo, Blockchain Enthusiast, Traveller, Trekker. https://www.linkedin.com/in/dcaichara/