Hyper Parameter Tuning (GridSearchCV Vs RandomizedSearchCV)

Vishnu Satheesh · Analytics Vidhya · Dec 22, 2020

Data scientists deal with hyper parameter tuning quite often in their day-to-day machine learning work. So what are hyper parameters and why do we need them? In this article we will discuss the two main hyper parameter search methods in scikit-learn: GridSearchCV and RandomizedSearchCV.


What are Hyper Parameters?

Hyper parameters are like handles available to control the behavior and output of the algorithm used for modeling, and they are supplied to the algorithm as arguments. For example, in model = DecisionTreeClassifier(criterion='entropy'), the criterion 'entropy' is a hyper parameter passed to the decision tree.

The get_params() method returns a dictionary of all the hyper parameters of an estimator along with their current values.
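
For example, a quick sketch (assuming scikit-learn is installed) that prints the hyper parameters of the DecisionTreeClassifier mentioned above:

# List every hyper parameter of an estimator with get_params()
from sklearn.tree import DecisionTreeClassifier
model = DecisionTreeClassifier(criterion='entropy')
print(model.get_params())  # dict of every hyper parameter and its current value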

When hyper parameters are not supplied, the algorithm falls back on its default values, which are rarely the best choice for a given dataset. This makes hyper parameter tuning one of the critical steps in a machine learning implementation.

Steps involved in hyper parameter tuning

  1. Choose the appropriate algorithm for the model
  2. Decide the parameter space
  3. Decide the method for searching parameter space
  4. Decide the cross-validation method
  5. Decide the score metrics to evaluate your model

In order to search for the best values in the hyper parameter space, we can use

  1. GridSearchCV (considers all possible combinations of hyper parameters)
  2. RandomizedSearchCV (only few samples are randomly selected)

Cross-validation is a resampling procedure used to evaluate machine learning models. It has a single parameter k, which is the number of partitions the data sample is split into, so the procedure is often called k-fold cross-validation. The data is first divided into training and testing sets to prevent data leakage: all tuning happens on the training data, and the testing set is only used after the model has been fit. Within the training data, the model is fit k times, each time training on k-1 folds and evaluating on the remaining held-out fold, and the average of the k evaluation scores is used to judge the model overall.
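
As a minimal sketch of k-fold cross-validation (the toy regression data here is an illustrative assumption, not from the article), scikit-learn's cross_val_score handles the splitting, fitting and averaging:

# Evaluate a model with 5-fold cross-validation and average the fold scores
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, n_features=10, random_state=42)
model = RandomForestRegressor(random_state=42)
scores = cross_val_score(model, X, y, cv=5)  # k = 5 folds
print(scores.mean(), scores.std())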


GridSearchCV

Grid search is one of the most basic hyper parameter tuning techniques, and its implementation is quite simple. A model is built for every possible combination of the supplied hyper parameter values, the performance of each model is evaluated, and the best performing one is selected. Since GridSearchCV builds and evaluates a model for each and every combination, the method is highly computationally expensive. The Python implementation of GridSearchCV for the random forest algorithm is shown below.

# Run GridSearchCV to tune the hyper parameters of a random forest regressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

rfr = RandomForestRegressor()
k_fold_cv = 5  # 5-fold cross validation (KFold is used, since this is a regressor)
grid_params = {
    "n_estimators": [10, 50, 100],
    "max_features": ["auto", "log2", "sqrt"],  # note: "auto" was removed in scikit-learn 1.3
    "bootstrap": [True, False]
}
grid = GridSearchCV(rfr, param_grid=grid_params, cv=k_fold_cv,
                    n_jobs=1, verbose=0, return_train_score=True)
grid.fit(X_train, y_train)  # X_train, y_train from an earlier train/test split
print('Best hyper parameter:', grid.best_params_)
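
Once the search finishes, the best estimator (refit on the full training data by default) can be checked against the held-out test set. A short sketch, assuming X_test and y_test were split off earlier:

# Inspect and evaluate the best model found by the grid search
best_rfr = grid.best_estimator_
print('Best CV score:', grid.best_score_)
print('Test R^2:', best_rfr.score(X_test, y_test))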

If you look at grid_params, there are three values each for n_estimators and max_features, so these two hyper parameters alone produce 3 x 3 = 9 combinations; including the two bootstrap values, the full grid contains 18 combinations, each of which is fit and scored in every cross-validation fold.
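
A small sketch using scikit-learn's ParameterGrid on the grid_params dictionary defined above makes that count explicit:

# Count every combination GridSearchCV will try
from sklearn.model_selection import ParameterGrid
n_combos = len(ParameterGrid(grid_params))
print(n_combos)               # 18 hyper parameter combinations
print(n_combos * k_fold_cv)   # 90 model fits with 5-fold cross-validation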


The full set of hyper parameter permutations can generate a huge number of models, and as the data size grows the time needed to fit them all increases drastically. This is why data scientists prefer RandomizedSearchCV over GridSearchCV when dealing with huge datasets.

RandomizedSearchCV

In RandomizedSearchCV, instead of providing a discrete set of values for each hyper parameter, we can provide either a statistical distribution or a list of values, and the values tried are sampled at random from it. Only a fixed number of combinations (n_iter) is built and evaluated. The Python implementation of RandomizedSearchCV for the random forest algorithm is shown below.

# Run RandomizedSearchCV to tune the hyper parameters of a random forest regressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

rfr = RandomForestRegressor()
k_fold_cv = 5  # 5-fold cross validation
params = {
    "n_estimators": [10, 50, 100],
    "max_features": ["auto", "log2", "sqrt"],
    "bootstrap": [True, False]
}
random = RandomizedSearchCV(rfr, param_distributions=params, cv=k_fold_cv,
                            n_iter=5, scoring='neg_mean_absolute_error', verbose=2,
                            random_state=42, n_jobs=-1, return_train_score=True)
random.fit(X_train, y_train)  # X_train, y_train from an earlier train/test split
print('Best hyper parameter:', random.best_params_)
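
The example above samples from plain lists, but param_distributions also accepts scipy.stats distributions. A sketch of that variant, reusing rfr, k_fold_cv and the training data from above (the particular distributions chosen here are illustrative assumptions, not from the original article):

# Sample hyper parameter values from distributions instead of fixed lists
from scipy.stats import randint, uniform
dist_params = {
    "n_estimators": randint(10, 200),    # any integer in [10, 200)
    "max_features": uniform(0.1, 0.9),   # fraction of features in [0.1, 1.0)
    "bootstrap": [True, False]
}
random_dist = RandomizedSearchCV(rfr, param_distributions=dist_params,
                                 cv=k_fold_cv, n_iter=5, random_state=42)
random_dist.fit(X_train, y_train)
print('Best hyper parameter:', random_dist.best_params_)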

We can conclude that GridSearchCV is practical only when the dataset and the hyper parameter grid are small. For larger datasets and larger search spaces, RandomizedSearchCV typically reaches a comparable result at a fraction of the computational cost.

Hope you got some insights from the article. Follow for more!
