Hyperparameter Optimization for LightGBM, CatBoost and XGBoost Regressors Using Bayesian Optimization
How do you optimize the hyperparameters of boosting machine learning algorithms with Bayesian optimization?
Boosting machine learning algorithms are widely used because they give better accuracy than simpler ones. The performance of these algorithms depends on hyperparameters, and an optimal set of parameters can help achieve higher accuracy. Finding hyperparameters manually is tedious and computationally expensive, so automating hyperparameter tuning is important. Random search, GridSearchCV, and Bayesian optimization are generally used to optimize hyperparameters; Bayesian optimization gives better and faster results compared to the other methods.
How does Bayesian optimization work?
- Build a surrogate probability model of the objective function
- Find the hyperparameters that perform best on the surrogate
- Apply these hyperparameters to the true objective function
- Update the surrogate model incorporating the new results
- Repeat steps 2–4 until max iterations or time is reached
A Bayesian optimizer builds a probability model of a given objective function and uses it to select the most promising hyperparameters to evaluate on the true objective function. If you want to study the topic in depth, then read here and here.
How to optimize hyperparameters with Bayesian optimization?
I will use the bayesian-optimization Python package to demonstrate Bayesian model-based optimization. Install the bayesian-optimization package via pip:
pip install bayesian-optimization
The hyperparameter optimization process can be done in 3 parts.
Part 1 — Define objective function
Define an objective function that takes hyperparameters as input and gives a score as output, which has to be maximized or minimized.
Part 2 — Define search space of hyperparameters
Define a range for each hyperparameter to optimize. Keep the parameter ranges narrow for better results.
Part 3 — Define a surrogate model of the objective function and call it
Create a Bayesian optimization function and call it to maximize the objective output. The Bayesian optimization function takes 3 inputs: the objective function, the search space, and random_state.
Let’s implement Bayesian optimization for boosting machine learning algorithms for regression.
Import libraries and load data.
I will use the Boston Housing data for this tutorial. The data has 13 predictor variables (independent variables) and Price as the criterion variable (dependent variable).
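A note for readers running this today: load_boston was removed from scikit-learn in version 1.2, so a stand-in dataset keeps the walkthrough reproducible. The sketch below (shapes and column names are my own, not the article's) builds a synthetic regression set with the same 13-predictor layout:

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# 506 rows and 13 predictors, mirroring the Boston Housing layout.
X, y = make_regression(n_samples=506, n_features=13, noise=10.0,
                       random_state=42)
X = pd.DataFrame(X, columns=[f"feat_{i}" for i in range(13)])

# Hold out a quarter of the rows as the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)
print(X_train.shape, X_test.shape)
```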
1. LightGBM Regressor
a. Objective Function
The objective function will return the negative of l1 (absolute loss, aliases: mean_absolute_error, mae), so the goal will be to maximize the output of the objective function. You can also use l2, l2_root, or poisson instead of l1.
Note:
LightGBM and XGBoost don’t have an R-squared metric. If you want to use R2 instead of the other evaluation metrics, you have to write your own R2 metric.
See an example of an objective function with an R2 metric.
b. Search Space
Define the range of each input parameter of the objective function. The number of input parameters matches the number of hyperparameters you want to optimize. This example has 6 hyperparameters.
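The six ranges might look like the sketch below (these bounds are my illustrative choices, not the article's exact values):

```python
# Bounds for six LightGBM hyperparameters; the integer-valued ones
# are cast back to int inside the objective function.
pbounds = {
    "num_leaves": (20, 60),
    "max_depth": (3, 10),
    "learning_rate": (0.01, 0.3),
    "n_estimators": (50, 300),
    "min_child_samples": (5, 50),
    "subsample": (0.6, 1.0),
}
```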
c. Surrogate Model and Optimization
Define a Bayesian optimization function and maximize the output of the objective function. The sum of init_points and n_iter equals the total number of optimization rounds.
Put it all together in a single function.
The output of the above code is a table with the objective function's output as target alongside the corresponding input parameter values. To get the best parameters, use optimizer.max['params'].
2. CatBoost Regressor
a. Objective Function
The objective function takes two inputs: depth and bagging_temperature. It will return the maximum mean R-squared value on the test set.
b. Search Space
The objective function has only two input parameters, therefore the search space will also have only 2 parameters.
c. Surrogate Model and Optimization
The Bayesian optimizer will optimize depth and bagging_temperature to maximize the R2 value.
3. XGBoost Regressor
a. Objective Function
The objective function gives the maximum value of r2 for the input parameters.
Note:
- If eval_metric is included in the parameters, then use a smaller early_stopping_rounds (10 or less). Why? Because if the value of the evaluation metric doesn’t improve for the given number of early stopping rounds, training will stop.
- If eval_metric is not defined in the parameters, then use a larger early_stopping_rounds, but smaller than num_boost_round. Why? Because training will stop at the given early stopping round.
b. Search Space
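An illustrative three-parameter search space for the XGBoost regressor (my own choice of parameters and bounds, not the article's exact ones):

```python
# Search space for the XGBoost objective; max_depth is cast to int
# inside the objective function.
pbounds = {
    "max_depth": (3, 10),
    "learning_rate": (0.01, 0.3),
    "subsample": (0.6, 1.0),
}
```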
c. Surrogate Model and Optimization
I hope you have grasped the whole concept of hyperparameter optimization with Bayesian optimization. Hyperparameter tuning seems easy now, right?
Conclusion
It is easy to optimize hyperparameters with Bayesian optimization. LightGBM and XGBoost don’t have an r2 metric, therefore we should define our own r2 metric. There is a small difference between the r2 metrics for LightGBM and XGBoost: the LightGBM R2 metric should return 3 outputs, whereas the XGBoost R2 metric should return 2 outputs.
We can use different evaluation metrics based on model requirements. Keep the search space ranges narrow for better results. bayesian-optimization maximizes the output of the objective function, therefore the objective must return the negative of l1 & l2 losses, while r2 can be returned as-is.
Check out the notebook on GitHub or the Colab notebook to see the use cases. Reach out to me on LinkedIn if you have any queries. Happy parameter tuning! Thank you for reading! ☺️
References:
- https://towardsdatascience.com/a-conceptual-explanation-of-bayesian-model-based-hyperparameter-optimization-for-machine-learning-b8172278050f
- https://towardsdatascience.com/an-introductory-example-of-bayesian-optimization-in-python-with-hyperopt-aae40fff4ffo
- https://medium.com/spikelab/hyperparameter-optimization-using-bayesian-optimization-f1f393dcd36d
- https://www.kaggle.com/omarito/xgboost-bayesianoptimization
- https://github.com/fmfn/BayesianOptimization