Using Grid Search For Hyper-Parameter Tuning
Here we are going to explore an efficient way to tune our model’s hyperparameters using Grid Search. We will look at the facilities it provides and run through some scenarios with Python code.
TL;DR
In an inefficient scenario, you would manually test your model’s performance against different sets of parameters. It would be a trial-and-error approach and, without a doubt, very slow.
Since some of us have a significant affinity for coding, we may try to automate this process a bit by running some for loops here and there. But that is tedious to write and maintain, and it still runs for as long as there are combinations to try.
Grid Search solves this problem by automating the process of parameter tuning. Instead of manually testing various combinations of parameters, Grid Search systematically explores a predefined set of parameter values, effectively creating a grid of possible configurations.
This method streamlines the optimization process by testing each combination automatically, saving time and effort. By evaluating the model’s performance across the grid, Grid Search helps identify the best parameter combination that optimizes the model’s performance, making the tuning process much more efficient and less prone to human error.
Also check out Hyper Parameter Tuning With Random Search.
How does it work?
I will explain that with the help of some code, so you can see how it behaves in action.
Step 1: Define a hyperparameter grid
This is basically a Python dictionary that contains the configuration for the model we are tuning.
In our example we will be working with Support Vector Machines or SVM for short.
param_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['linear', 'rbf', 'poly']
}
SVC takes many parameters. Here we tune two of them, C and kernel, and define three candidate values for each in a list. Of course, we could define more if we wanted.
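To make the "grid" concrete, here is a minimal sketch that enumerates every combination the dictionary above describes. It uses only the standard library's itertools.product as a stand-in for illustration. It is not how scikit-learn is implemented internally, just the same idea, and the names and candidates variables are made up for this example.

```python
from itertools import product

param_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['linear', 'rbf', 'poly'],
}

# Build every combination of the listed values (the Cartesian product).
names = list(param_grid)
candidates = [
    dict(zip(names, values))
    for values in product(*param_grid.values())
]

# 3 values of C times 3 kernels gives 9 candidates to evaluate.
print(len(candidates))
for candidate in candidates:
    print(candidate)
```

Adding a third parameter with four values would multiply the count to 36, which is why grids grow quickly.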
C here is known as the regularization parameter or the cost parameter, which controls the trade-off between maximizing the margin (distance between the decision boundary and the data points) and minimizing the classification error on the training data.
kernel here represents the kernel type used for this model. This is a common parameter which you will find in kernel-based models. It specifies the type of kernel function to be used when transforming the input data into a higher-dimensional space.
📌 It’s ok if it doesn’t make sense to you. We are here for GridSearch, not for SVM.
Step 2: Model Training and Evaluation
Grid Search typically uses cross-validation to evaluate model performance. You split your dataset into multiple subsets (folds), train the model on some of them, and evaluate it on others. This helps ensure that the model’s performance is robust and not just tailored to the training data.
But you don’t have to worry about implementing cross-validation separately. You will notice that you import Grid Search as GridSearchCV. The CV suffix means that it comes pre-packed with cross-validation functionality.
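For comparison, here is a rough sketch of the hand-written loop that GridSearchCV saves you from. It assumes the Iris dataset as a stand-in (the article does not name a dataset) and uses scikit-learn's cross_val_score to run 5-fold cross-validation for each candidate. The best_score and best_params variables are just bookkeeping for this example.

```python
from itertools import product

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Iris is only a stand-in dataset for this sketch.
X, y = load_iris(return_X_y=True)

param_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf', 'poly']}

best_score, best_params = -1.0, None
for C, kernel in product(param_grid['C'], param_grid['kernel']):
    # 5-fold cross-validation: one accuracy score per held-out fold.
    fold_scores = cross_val_score(
        SVC(C=C, kernel=kernel), X, y, cv=5, scoring='accuracy'
    )
    mean_score = fold_scores.mean()
    # Keep the candidate with the highest mean score seen so far.
    if mean_score > best_score:
        best_score, best_params = mean_score, {'C': C, 'kernel': kernel}

print(best_params, round(best_score, 3))
```

Every piece of this loop (enumerating candidates, splitting folds, averaging scores, tracking the winner) is something GridSearchCV handles for you.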
For the evaluation, you need to import only two things:
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
📌 Here I’m importing GridSearchCV and the model I want to evaluate, which is SVC.
Grid Search accepts several arguments, but for simplicity’s sake we will only use a few important ones.
Also, this is where train_test_split comes in handy, as you will be fitting your grid_search object on your training dataset.
grid_search = GridSearchCV(SVC(), param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)
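The two lines above assume X_train and y_train already exist. If you want something you can run end to end, here is a minimal sketch. The Iris dataset, the 80/20 split, and random_state=42 are my assumptions for illustration, not part of the original setup.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Stand-in data: swap in your own features and labels.
X, y = load_iris(return_X_y=True)

# Hold out a test set that the grid search never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

param_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['linear', 'rbf', 'poly'],
}

# Same call as in the article: 5-fold CV, scored by accuracy.
grid_search = GridSearchCV(SVC(), param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)

# Number of parameter combinations that were evaluated.
print(len(grid_search.cv_results_['params']))
```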
The first argument is the model which we want to evaluate. The second argument is the grid configuration we made earlier using a Python dictionary.
The cv argument accepts an integer, which represents the number of folds we want our cross-validation to use. (It also accepts a cross-validation splitter object or an iterable of train/test splits.)
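If you want more control over how the folds are made, cv can also take a splitter object instead of a number. Here is a small sketch using StratifiedKFold with shuffling. The Iris data and the random_state=0 seed are assumptions for the sake of a runnable example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # stand-in dataset
param_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf', 'poly']}

# Explicit splitter: 5 stratified folds, shuffled with a fixed seed
# so the folds are the same on every run.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

grid_search = GridSearchCV(SVC(), param_grid, cv=cv, scoring='accuracy')
grid_search.fit(X, y)

# n_splits_ reports how many folds were actually used.
print(grid_search.n_splits_)
# cv_results_ keeps one score array per fold: split0_test_score, split1_test_score, ...
print(grid_search.cv_results_['split0_test_score'])
```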
scoring represents the strategy employed to evaluate the model in question. Here we are using "accuracy", which means that we want to evaluate the model using its accuracy score. If you leave this argument out, GridSearchCV falls back to the model’s own score method. For SVC, as for other scikit-learn classifiers, that is accuracy.
scoring can accept several data types, not just string values, such as a callable or a list of metric names. See the documentation for details.
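As one example of a non-string value, here is a sketch that passes a list of metric names. With more than one metric, GridSearchCV needs to be told through refit which one decides the "best" combination. The choice of 'f1_macro' as the second metric and the Iris data are assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # stand-in dataset
param_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf', 'poly']}

# Evaluate two metrics per candidate; refit names the one used to pick the winner.
grid_search = GridSearchCV(
    SVC(),
    param_grid,
    cv=5,
    scoring=['accuracy', 'f1_macro'],
    refit='accuracy',
)
grid_search.fit(X, y)

# Result keys are suffixed with the metric name instead of plain "score".
results = grid_search.cv_results_
best = grid_search.best_index_
print(results['mean_test_accuracy'][best])
print(results['mean_test_f1_macro'][best])
```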
You could also add a verbose argument, which accepts an integer (the default is 0). The higher the value, the more detailed the logs of what is going on. According to the scikit-learn documentation:
>1 👉 the computation time for each fold and parameter candidate is displayed
>2 👉 the score is also displayed
>3 👉 the fold and candidate parameter indexes are also displayed together with the starting time of the computation
Step 3: Get the best scores
This step is easy. All it requires you to do is access the best_params_ attribute of the fitted GridSearchCV object. This gives you a dictionary.
best_params = grid_search.best_params_
best_score = grid_search.best_score_
best_score_ gives you a float value, which is the mean cross-validated score of the best combination. In this scenario that is its mean accuracy across the five folds.
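Here is a small runnable sketch of this step, again assuming the Iris dataset as a stand-in. It also shows how best_score_ ties back to cv_results_ through the best_index_ attribute.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # stand-in dataset
param_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf', 'poly']}

grid_search = GridSearchCV(SVC(), param_grid, cv=5, scoring='accuracy')
grid_search.fit(X, y)

best_params = grid_search.best_params_  # dict with the winning 'C' and 'kernel'
best_score = grid_search.best_score_    # mean CV accuracy of that combination

print(best_params)
print(round(best_score, 3))

# best_index_ is the row of cv_results_ that belongs to the winner.
i = grid_search.best_index_
print(grid_search.cv_results_['params'][i])
print(grid_search.cv_results_['mean_test_score'][i])
```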
Step 4: Final model training
Now you will train the model again, but this time using the parameter values which got the highest scores.
final_model = SVC(
    C=best_params['C'],
    kernel=best_params['kernel'],
)
final_model.fit(X_train, y_train)
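One thing worth knowing here: GridSearchCV has a refit parameter that defaults to True, so after fit it has already retrained a model with the best parameters on the full training set and stored it as best_estimator_. The sketch below compares that model with a manually retrained one on a held-out test set. The Iris data, the split, and random_state=42 are assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # stand-in dataset
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

param_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf', 'poly']}
grid_search = GridSearchCV(SVC(), param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)

# Manual retraining, as in the step above (unpacking the dict is equivalent
# to passing C and kernel one by one).
best_params = grid_search.best_params_
final_model = SVC(**best_params)
final_model.fit(X_train, y_train)

# The model GridSearchCV already refitted for us.
refitted_model = grid_search.best_estimator_

# Evaluate both on data that played no part in the search.
print(final_model.score(X_test, y_test))
print(refitted_model.score(X_test, y_test))
```

Both routes should give you the same model here, so the manual step is mostly useful when you want to retrain on different data or tweak something else before the final fit.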
Step 5 (Optional): Visualize the grid
Since GridSearchCV tries every possible combination from the given param_grid dictionary, we can visualize that grid as a heatmap.
For this purpose, you will need to import some new dependencies.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Access the grid search results
results = grid_search.cv_results_
# Extract relevant data into a DataFrame
data = pd.DataFrame({
    'C': results['param_C'],
    'kernel': results['param_kernel'],
    'mean_score': results['mean_test_score']
})
# Reshape the data into a pivot table format for the heatmap
heatmap_data = data.pivot(index='C', columns='kernel', values='mean_score')
# Create a heatmap using seaborn
plt.figure(figsize=(10, 6))
sns.heatmap(heatmap_data, annot=True)
plt.xlabel('Kernel')
plt.ylabel('C')
plt.title('Grid Search Results Heatmap')
plt.show()
cv_results_ is a dictionary of arrays that contains every combination of values the grid search object tried out, along with its scores. The arrays are aligned by index: entry i of each array belongs to the same combination.
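If you just want to inspect cv_results_ without plotting, loading it into a DataFrame is a handy way to see that index alignment, since each row becomes one combination. This is a minimal sketch assuming the Iris dataset as a stand-in. The columns list is only my pick of the most useful keys.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # stand-in dataset
param_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf', 'poly']}

grid_search = GridSearchCV(SVC(), param_grid, cv=5, scoring='accuracy')
grid_search.fit(X, y)

# Each key of cv_results_ becomes a column; each row is one parameter combination.
results = pd.DataFrame(grid_search.cv_results_)

columns = [
    'param_C',
    'param_kernel',
    'mean_test_score',
    'std_test_score',
    'rank_test_score',
]

# Rank 1 is the best combination.
print(results[columns].sort_values('rank_test_score').to_string(index=False))
```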
Everything else is self-explanatory. Here is what your heatmap might look like:
The value in each cell is the mean cross-validated accuracy score for that combination of parameter values.
Thank you for reading till the end 🥳
Now you are able to find the best parameters for your model more quickly and efficiently.
If you found this article helpful and learned something new, be sure to leave a 👏 or let me know 📣 your thoughts on it. I’d love to listen to your feedback.