Tuning Model Hyperparameters With Random Search

A Comparison with Grid Search

M. Hammad Hassan
6 min read · Nov 25, 2023

If you don’t already know, I recently posted about Grid Search being a valuable tool for hyperparameter tuning. But like every other tool, it has its drawbacks.

TL;DR

Random Search is a hyperparameter optimization technique in machine learning that randomly samples a defined number of hyperparameter combinations from specified distributions. By distributions, I mean sets or ranges of possible parameter values.

It efficiently explores a subset of the hyperparameter space, making it faster and more resource-efficient than exhaustive methods like grid search.

While it might not guarantee finding the absolute best hyperparameters, it effectively discovers good settings, especially in scenarios where large or complex parameter spaces are involved.

Here I will introduce you to Random Search, tell you how it works and how it compares to grid search.


What is it?

Random (or Randomized) Search is a search algorithm that tries out different hyperparameter values for a machine learning model by sampling them at random from the sets or ranges you provide.

Hyperparameters are settings that aren’t learned during the training process but affect the behavior and performance of a machine learning model.

Minor Throwback To Grid Search?

In my related previous article, where I explain Using Grid Search For Hyper-Parameter Tuning, we gave the algorithm lists of values to select from (generally called a grid of values), and it tried every possible combination of those values with our model to find the optimal settings (hyperparameters).

How Does It Work?

Randomized Search, as the name suggests, picks a parameter value at random from a range or list of possible values for a specific hyperparameter.

A little disclaimer … I will be demonstrating this with RandomForestClassifier() which can be obtained from sklearn.ensemble. The data I’m going to be using with this model is none other than the iris dataset obtainable from sklearn.datasets as load_iris.

📌 Also, make sure you have split the dataset into training and testing sets before you begin. You will soon know why you need this.

import numpy as np  # used below to build ranges of parameter values
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
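
If you want to follow along exactly, here is a minimal sketch of that setup: loading iris and splitting it into the X_train, X_test, y_train, y_test variables used later. The 80/20 split and the random_state are just my choices, not requirements.

# Load the iris dataset and split it into training and testing sets
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)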

Step 1: Define The Parameter Space Or Distribution

To define a distribution, we only need a Python dictionary which contains the parameter name as the KEY and the range of possible values as the VALUE for that key.

Pretty easy start, huh? Take your time reading the comments below:

# Define hyperparameter distributions
param_dist = {
    'n_estimators': [int(x) for x in np.linspace(start=200, stop=2000, num=10)],  # Number of trees
    'max_features': ['sqrt', 'log2'],  # Number of features to consider for best split ('auto' was removed in recent sklearn versions)
    'max_depth': [int(x) for x in np.linspace(10, 110, num=11)],  # Maximum depth of the trees
    'min_samples_split': [2, 5, 10],  # Minimum samples required to split a node
    'min_samples_leaf': [1, 2, 4],  # Minimum samples required at each leaf node
    'bootstrap': [True, False]  # Whether bootstrap samples are used when building trees
}

For the ranges of parameter values, you can use Python lists. Since these are lists, we can also use list comprehensions to help us create lengthy ones, just like I have done with the list for n_estimators.

📌 Please take note that I’m using numpy here in some lists. You are free to use whichever methods you like. But if you are going to follow along, you will need numpy installed.
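
📌 One more thing worth knowing: instead of fixed lists, RandomizedSearchCV also accepts scipy.stats distributions (anything with an rvs method) as values, which lets it sample from a whole range rather than a handful of points. A minimal sketch, assuming you have scipy installed:

from scipy.stats import randint

# Same idea, but sampling integers from ranges instead of picking from fixed lists
param_dist_alt = {
    'n_estimators': randint(200, 2000),  # any integer in [200, 2000)
    'max_depth': randint(10, 110),       # any integer in [10, 110)
    'min_samples_split': [2, 5, 10]      # lists and distributions can be mixed
}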

Step 2: Prime The Random Search Object

We can get the RandomizedSearchCV object from the same module we got our train_test_split from.

from sklearn.model_selection import RandomizedSearchCV

Note that the name of the class we are importing ends with CV. This means it comes pre-packed with Cross-Validation, just like Grid Search. The cross-validation itself runs on the training set, which is why we used train_test_split earlier: the held-out test set stays untouched for the final evaluation.

If you are not familiar with the cross-validation technique, you basically split your dataset into multiple subsets (folds), train the model on some of them, and evaluate it on others. This helps ensure that the model’s performance is robust and not just tailored to the training data.
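
If you want to see that idea on its own first, here is a tiny sketch using cross_val_score: plain 5-fold cross-validation of a default Random Forest on the training set, with no parameter search involved.

from sklearn.model_selection import cross_val_score

# Evaluate a default Random Forest with 5-fold cross-validation on the training set
scores = cross_val_score(RandomForestClassifier(), X_train, y_train, cv=5)
print("Fold accuracies:", scores)
print("Mean accuracy:", scores.mean())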

# Create a Random Forest Classifier
rf_classifier = RandomForestClassifier()

# Create the RandomizedSearchCV object
random_search = RandomizedSearchCV(
    estimator=rf_classifier,
    param_distributions=param_dist,
    n_iter=100,  # Number of parameter settings that are sampled
    cv=5,        # Number of cross-validation folds
    verbose=2,
    scoring="accuracy",
    random_state=42,
    n_jobs=-1    # Use all available processors
)

There is a lot going on in here, so let’s walk through it slowly.

RandomizedSearchCV takes an argument named estimator which is the model we want to use.

param_distributions is the dictionary we prepared earlier.

n_iter signifies the number of parameter combinations that are sampled. The higher the value, the better your chances of finding a good combination, but the longer the search takes, so plan on a value that balances the two. The default is 10. To put the trade-off in perspective: the grid above contains 10 × 2 × 11 × 3 × 3 × 2 = 3,960 possible combinations, so an exhaustive grid search with 5 folds would mean 19,800 model fits, while n_iter=100 caps it at 500.

cv represents the number of folds we should perform for cross-validation. If nothing is provided, it will be set to a default value, which is 5.

Adding a verbose argument will show you the progress of the search being performed. It accepts an integer; the higher the value, the more detailed the logs. Here is roughly what each level adds:

  • >1 👉 the computation time for each fold and parameter candidate is displayed
  • >2 👉 the score is also displayed
  • >3 👉 the fold and candidate parameter indexes are also displayed together with the starting time of the computation

scoring is the metric which will be considered for determining the best results. If this is not provided, a default is chosen based on the estimator's own score method. In the case of a Random Forest Classifier, that is accuracy.

random_state seeds the random sampling of parameter combinations so that your results are reproducible. It accepts an integer.

n_jobs represents the number of jobs to run in parallel. Since the candidates and folds can be evaluated independently, running jobs in parallel cuts down on processing time. -1 means use all available processors. The default is None, which effectively means 1.

To keep it from getting more complex than it already is, I have skipped some arguments that RandomizedSearchCV can take. But know this:

  • refit controls whether the best estimator is refitted on the whole training set using the best parameters found (and, when multiple scoring metrics are used, which one decides what "best" means).
  • pre_dispatch controls the number of jobs that get dispatched during parallel execution.
  • return_train_score accepts a boolean which determines whether training scores are included in the results. Leaving it at the default (False) saves some computation time.

📌 You can learn more about these arguments in SKLearn’s Official Docs.

Step 3: Fit The Data & Extract The Results

The hard part is over. Now we will simply fit the model with the training dataset.

# Fit the RandomizedSearchCV instance
random_search.fit(X_train, y_train)

Let’s get the results …

# Retrieve the best parameters found by RandomizedSearchCV
best_params = random_search.best_params_
print("Best Parameters:", best_params)

# Evaluate the model with best parameters on the test set
best_estimator = random_search.best_estimator_
test_accuracy = best_estimator.score(X_test, y_test)
print("Test Accuracy with Best Parameters:", test_accuracy)

best_params_ is a dictionary which may look like this:

{
    'n_estimators': 2000,
    'min_samples_split': 10,
    'min_samples_leaf': 2,
    'max_features': 'sqrt',
    'max_depth': 10,
    'bootstrap': True
}

Notice it is using the same names we used while defining param_dist.
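
And since best_params_ is just a dictionary, you can rebuild a fresh classifier from it by unpacking it with **, which is handy if you want to retrain from scratch later. A quick sketch:

# Rebuild a classifier from the best parameters and retrain it on the training set
final_model = RandomForestClassifier(**best_params)
final_model.fit(X_train, y_train)
print("Test Accuracy of the rebuilt model:", final_model.score(X_test, y_test))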

📌 Unlike in Grid Search, where I showed you a heatmap with all the values of the grid, I won’t be plotting any charts here. This is simply because Random Search is not systematic like Grid Search, so plotting a bunch of RANDOM values will not really tell us anything useful.
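
That said, if you are curious which combinations were sampled and how each one scored, everything is recorded in cv_results_, which loads neatly into a pandas DataFrame (assuming you have pandas installed):

import pandas as pd

# Inspect the sampled combinations and their mean cross-validation scores
results = pd.DataFrame(random_search.cv_results_)
print(results[['params', 'mean_test_score', 'rank_test_score']].head())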

Thank you for reading till the end 🥳

Now you are able to find good parameters for your model more quickly and efficiently … in a more processor-friendly manner than with Grid Search.

If you found this article helpful and learned something new, be sure to leave a 👏 or let me know 📣 your thoughts on it. I’d love to hear your feedback.
