A Comprehensive Guide to Hyperparameter Tuning

SHRINATH BHAT
Jan 14, 2023 · 6 min read


Hyperparameter tuning is the process of finding the set of hyperparameters that gives a machine learning model its best performance on the task at hand. Unlike model parameters, hyperparameters are not learned from the data during training, so they must be chosen and tuned separately.

You can use several methods to tune hyperparameters; some common ones include:

1. Grid search: In grid search, you specify a grid of hyperparameter values, and the model is trained and evaluated on every possible combination of those values. Grid search can be time-consuming, as it requires training and evaluating the model once per combination. To use any of these methods in practice, you will need a machine learning library that implements them, such as scikit-learn. Here is an example of how to use the GridSearchCV class from scikit-learn to tune the hyperparameters of a random forest regressor with grid search:

# Import required libraries
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Load the data (load_data() is a placeholder for your own data-loading code)
X, y = load_data()

# Create the regressor
regressor = RandomForestRegressor()

# Define the hyperparameter grid
param_grid = {
    "n_estimators": [10, 50, 100],
    "max_depth": [None, 5, 10],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4]
}

# Create the grid search object
grid_search = GridSearchCV(regressor, param_grid, cv=5)

# Fit the grid search object to the data
grid_search.fit(X, y)

# Print the best parameters
print(grid_search.best_params_)

This code creates a grid search object and fits it to the data using the fit method. The best hyperparameters are then printed using the best_params_ attribute.
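
If you also need the cross-validated score of that combination or the already-refitted model, GridSearchCV exposes them as attributes; a brief follow-up sketch, continuing the example above:

# Mean cross-validated score of the best parameter combination
print(grid_search.best_score_)

# GridSearchCV refits the best model on the full data by default (refit=True),
# so it can be used directly for prediction
best_model = grid_search.best_estimator_
predictions = best_model.predict(X)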

2. Random search: In random search, you specify a distribution (or a set of candidate values) for each hyperparameter, and random combinations are sampled from it to train and evaluate the model. Random search is often faster than grid search, as it does not evaluate every possible combination of hyperparameters. Here’s an example of how to use random search for hyperparameter tuning of a Random Forest classifier in Python:

from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
import numpy as np

# load the dataset
iris = load_iris()
X, y = iris.data, iris.target

# define the hyperparameter space
param_dist = {"n_estimators": np.arange(50, 150),
              "max_depth": np.arange(2, 10),
              "min_samples_split": np.arange(2, 10),
              "min_samples_leaf": np.arange(1, 10),
              "criterion": ["gini", "entropy"]}

# instantiate the Random Forest classifier
clf = RandomForestClassifier()

# create the random search object
random_search = RandomizedSearchCV(clf, param_distributions=param_dist,
                                   n_iter=50, cv=5, n_jobs=-1)

# fit the random search object to the data
random_search.fit(X, y)

# get the best set of hyperparameters
best_params = random_search.best_params_

# train the model using the best set of hyperparameters
best_clf = RandomForestClassifier(**best_params)
best_clf.fit(X, y)
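
The arrays above simply enumerate candidate values. RandomizedSearchCV also accepts SciPy probability distributions, which matches the idea of sampling from a distribution more literally; a minimal variant sketch, reusing clf, X and y from above:

from scipy.stats import randint

# each call to fit() samples n_iter combinations from these distributions
param_dist = {"n_estimators": randint(50, 150),
              "max_depth": randint(2, 10),
              "min_samples_split": randint(2, 10),
              "min_samples_leaf": randint(1, 10),
              "criterion": ["gini", "entropy"]}   # lists are sampled uniformly

random_search = RandomizedSearchCV(clf, param_distributions=param_dist,
                                   n_iter=50, cv=5, n_jobs=-1, random_state=0)
random_search.fit(X, y)
print(random_search.best_params_)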

3. Bayesian optimization: Bayesian optimization builds a probabilistic surrogate model of the objective (here, the cross-validated score) and uses it to decide which hyperparameter combination to evaluate next, so it typically needs far fewer model evaluations than grid or random search. Here’s an example of how to use Bayesian optimization for hyperparameter tuning of a Random Forest classifier in Python using the “scikit-optimize” library:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from skopt import BayesSearchCV

iris = load_iris()
X, y = iris.data, iris.target
clf = RandomForestClassifier()

# define the hyperparameter search space
search_spaces = {'n_estimators': (50, 500),
                 'criterion': ['gini', 'entropy'],
                 'max_depth': (2, 20),
                 'min_samples_split': (2, 20),
                 'min_samples_leaf': (1, 20)}

# perform the bayesian optimization
optimizer = BayesSearchCV(clf, search_spaces, n_iter=50, cv=5, n_jobs=-1)
optimizer.fit(X, y)

# get the best set of hyperparameters
best_params = optimizer.best_params_

# train the model using the best set of hyperparameters
best_clf = RandomForestClassifier(**best_params)
best_clf.fit(X, y)

In this example, the Iris dataset is loaded and the Random Forest classifier is defined. The BayesSearchCV object is then defined with the hyperparameter search space, the number of iterations, the number of cross-validation splits, and the number of parallel jobs to run. The optimizer is then fitted to the data and the best set of hyperparameters is obtained. Finally, the model is trained with the best set of hyperparameters.
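
In the dictionary above, scikit-optimize interprets the integer tuples as integer ranges and the list as a categorical dimension. You can also declare the dimensions explicitly with its space classes, which makes the intent clearer; a small variant sketch:

from skopt.space import Integer, Categorical

search_spaces = {'n_estimators': Integer(50, 500),
                 'criterion': Categorical(['gini', 'entropy']),
                 'max_depth': Integer(2, 20),
                 'min_samples_split': Integer(2, 20),
                 'min_samples_leaf': Integer(1, 20)}

optimizer = BayesSearchCV(clf, search_spaces, n_iter=50, cv=5, n_jobs=-1, random_state=0)
optimizer.fit(X, y)
print(optimizer.best_score_)   # best cross-validated accuracy found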

4. Gaussian Process (GP): Gaussian Process optimization is Bayesian optimization in which a Gaussian process is used as the surrogate model of the objective; instead of a ready-made search class, you minimize an objective function you write yourself. Here’s an example of how to use Gaussian Process optimization for hyperparameter tuning of a Random Forest classifier in Python using the “scikit-optimize” library:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from skopt import gp_minimize

iris = load_iris()
X, y = iris.data, iris.target
clf = RandomForestClassifier()

# define the hyperparameter search space
space = [
    (50, 500),            # n_estimators
    ['gini', 'entropy'],  # criterion
    (2, 20),              # max_depth
    (2, 20),              # min_samples_split
    (1, 20)               # min_samples_leaf
]

# define the function to optimize
def func(params):
    n_estimators, criterion, max_depth, min_samples_split, min_samples_leaf = params
    clf.set_params(n_estimators=n_estimators, criterion=criterion,
                   max_depth=max_depth, min_samples_split=min_samples_split,
                   min_samples_leaf=min_samples_leaf)
    # negate the accuracy because gp_minimize minimizes its objective
    return -np.mean(cross_val_score(clf, X, y, cv=5, n_jobs=-1, scoring="accuracy"))

# perform the gaussian process optimization
res = gp_minimize(func, space, n_calls=50, n_random_starts=10, random_state=0)

# get the best set of hyperparameters
best_params = res.x

# train the model using the best set of hyperparameters
best_clf = RandomForestClassifier(n_estimators=best_params[0], criterion=best_params[1],
                                  max_depth=best_params[2], min_samples_split=best_params[3],
                                  min_samples_leaf=best_params[4])
best_clf.fit(X, y)

In this example, the Iris dataset is loaded and the Random Forest classifier is defined. The gp_minimize function is then used to perform the Gaussian Process optimization, which is defined by the objective function, func, and the search space. The function to optimize is the cross-validated accuracy of the model and the number of calls and random starts are set to 50 and 10 respectively. The res object contains the results of the optimization, including the best set of hyperparameters, which are then used to train the final model.
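
Beyond res.x, the OptimizeResult returned by gp_minimize keeps the whole optimization trace, which is handy for checking whether 50 calls were enough; for example:

print(-res.fun)          # best cross-validated accuracy found
print(res.x)             # hyperparameters that achieved it, in the order defined by `space`
print(len(res.x_iters))  # number of hyperparameter sets that were evaluated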

5. Tree-structured Parzen Estimator (TPE): TPE is a Bayesian-style optimization method that models the distributions of hyperparameter values that led to good and bad scores, and samples new candidates where the ratio favours good scores. Here’s an example of how to use TPE for hyperparameter tuning of a Random Forest classifier in Python using the “hyperopt” library:

import numpy as np
from hyperopt import fmin, tpe, hp, space_eval
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score

iris = load_iris()
X, y = iris.data, iris.target

# define the hyperparameter search space
space = {
    'n_estimators': hp.quniform('n_estimators', 50, 500, 1),
    'criterion': hp.choice('criterion', ['gini', 'entropy']),
    'max_depth': hp.quniform('max_depth', 2, 20, 1),
    'min_samples_split': hp.quniform('min_samples_split', 2, 20, 1),
    'min_samples_leaf': hp.quniform('min_samples_leaf', 1, 20, 1)
}

# define the objective function (hp.quniform returns floats, so cast to int)
def objective(params):
    clf = RandomForestClassifier(n_estimators=int(params['n_estimators']),
                                 criterion=params['criterion'],
                                 max_depth=int(params['max_depth']),
                                 min_samples_split=int(params['min_samples_split']),
                                 min_samples_leaf=int(params['min_samples_leaf']))
    score = cross_val_score(clf, X, y, cv=5, n_jobs=-1, scoring="accuracy")
    return -np.mean(score)  # fmin minimizes, so negate the accuracy

# perform the TPE optimization
best = fmin(objective, space=space, algo=tpe.suggest, max_evals=50, verbose=1)

# map the raw result (choice indices and floats) back to actual parameter values
best_params = space_eval(space, best)

# train the model using the best set of hyperparameters
best_clf = RandomForestClassifier(n_estimators=int(best_params['n_estimators']),
                                  criterion=best_params['criterion'],
                                  max_depth=int(best_params['max_depth']),
                                  min_samples_split=int(best_params['min_samples_split']),
                                  min_samples_leaf=int(best_params['min_samples_leaf']))
best_clf.fit(X, y)

In this example, the Iris dataset is loaded and the search space is defined as a dictionary of hyperopt distributions. The fmin function is used to perform the TPE optimization, which is defined by the objective function, objective, and the search space. The quantity being optimized is the (negated) cross-validated accuracy of the model, and the number of evaluations is set to 50. Because fmin returns raw values (floats from quniform and indices from choice), space_eval is used to map the result back to actual hyperparameter values, which are then used to train the final model.
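
hyperopt does not keep the evaluation history unless you ask for it; passing a Trials object to fmin records every trial, which is useful for inspecting or plotting the search afterwards. A short sketch:

from hyperopt import Trials

trials = Trials()
best = fmin(objective, space=space, algo=tpe.suggest, max_evals=50, trials=trials)
print(min(trials.losses()))   # best (negated) cross-validated accuracy seen during the search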

6. Covariance Matrix Adaptation Evolution Strategy (CMA-ES): CMA-ES is an evolutionary strategy that repeatedly samples candidate solutions from a multivariate Gaussian distribution and adapts that distribution’s mean and covariance matrix toward the better-performing candidates. It searches a continuous space, so categorical hyperparameters such as the split criterion have to be encoded numerically. Here’s an example of how to use CMA-ES for hyperparameter tuning of a Random Forest classifier in Python using the “cma” library:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
import cma

iris = load_iris()
X, y = iris.data, iris.target
clf = RandomForestClassifier()

# CMA-ES works on continuous variables, so the categorical criterion is encoded
# as an index (0 = gini, 1 = entropy) and rounded back to a string inside the objective
criteria = ['gini', 'entropy']

# define the function to optimize
def func(params):
    n_estimators, criterion_idx, max_depth, min_samples_split, min_samples_leaf = params
    clf.set_params(n_estimators=int(n_estimators),
                   criterion=criteria[int(round(criterion_idx))],
                   max_depth=int(max_depth),
                   min_samples_split=int(min_samples_split),
                   min_samples_leaf=int(min_samples_leaf))
    return -np.mean(cross_val_score(clf, X, y, cv=5, n_jobs=-1, scoring="accuracy"))

# starting point: [n_estimators, criterion index, max_depth, min_samples_split, min_samples_leaf]
x0 = [100, 0, 10, 10, 10]

# perform the CMA-ES optimization; bounds are given as [lower_bounds, upper_bounds]
# (a single initial step size is used here, although the parameter ranges differ widely in scale)
res = cma.fmin(func, x0, 0.5,
               options={'bounds': [[50, 0, 2, 2, 1], [500, 1, 20, 20, 20]],
                        'verbose': -9})

# get the best set of hyperparameters
best_params = res[0]

# train the model using the best set of hyperparameters
best_clf = RandomForestClassifier(n_estimators=int(best_params[0]),
                                  criterion=criteria[int(round(best_params[1]))],
                                  max_depth=int(best_params[2]),
                                  min_samples_split=int(best_params[3]),
                                  min_samples_leaf=int(best_params[4]))
best_clf.fit(X, y)

Gaussian Process optimization, the Tree-structured Parzen Estimator (TPE), and CMA-ES can be computationally expensive, especially for large datasets and high-dimensional search spaces. In such cases it may be necessary to use a more efficient search strategy or to reduce the size of the search space; one such option is sketched below.
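
As one concrete example of a cheaper strategy, scikit-learn provides successive halving, which trains many candidates on a small budget (for example, a subset of the samples) and keeps only the most promising ones for larger budgets. A minimal sketch, reusing the param_dist dictionary from the random-search example (the halving searches are still marked experimental in scikit-learn):

from sklearn.experimental import enable_halving_search_cv  # noqa: enables the estimator below
from sklearn.model_selection import HalvingRandomSearchCV

halving_search = HalvingRandomSearchCV(RandomForestClassifier(), param_dist,
                                       factor=3, cv=5, random_state=0, n_jobs=-1)
halving_search.fit(X, y)
print(halving_search.best_params_)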

Reference: This article was written with the help of OpenAI’s ChatGPT.


SHRINATH BHAT

IIT Madras 2020 graduate. Senior Data Scientist at AB Inbev with 3 years of versatile experience in Machine Learning, Advanced Analytics & AI.