AutoSampler: Automatic Selection of Optimization Algorithms in Optuna
Introduction
AutoSampler automatically selects a sampler from those implemented in Optuna, depending on the situation. With AutoSampler, as in the code example below, users can achieve optimization performance equal to or better than Optuna’s default without having to know which optimization algorithm to use.
import optuna
import optunahub

study = optuna.create_study(
    sampler=optunahub.load_module(
        "samplers/auto_sampler"
    ).AutoSampler()  # Automatically selects an algorithm internally
)
In this article, we introduce AutoSampler, released on OptunaHub on October 31st. We explain why different optimization algorithms are necessary depending on the problems to be solved and the design principles behind AutoSampler’s automatic algorithm selection rules. Then, we describe how to use AutoSampler and provide brief benchmark results.
Why Algorithm Selection Is Needed
There are various optimization problems, and new optimization algorithms are continually being proposed in academic research. There are several reasons behind this.
One of the biggest reasons is the wide variety of problem settings. Some problems involve both integer and continuous variables, some have multiple objectives, and others allow only a limited number of evaluations. To handle these settings effectively, researchers and practitioners keep developing and improving optimization algorithms.
The Optuna documentation provides a sampler comparison table to help Optuna users select an algorithm according to their problem settings (Table 1). By referring to this table, users can easily find candidate samplers based on their support for various problem settings and the maximum number of evaluations.
Optuna offers various algorithms with different characteristics to deal effectively with various problems. Here is a simple example to show the difference between the algorithms. The left side of Figure 1 shows the results of minimizing a benchmark function (an 8-dimensional Styblinski-Tang function) using GPSampler and CmaEsSampler. The right side of Figure 1 shows the results of running the same operation with half of the variables in the function (x4-x8) replaced with integer variables. There is not much difference in the former (CmaEsSampler performs slightly better in this example), but in the latter, GPSampler achieves significantly better results. This is because Optuna’s GPSampler is relatively better at handling integer variables than CmaEsSampler.
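To make the comparison above concrete, here is a minimal sketch of how such an experiment could be set up. The Styblinski-Tang formula and its usual domain of [-5, 5] per variable are standard; everything else is an assumption for illustration. The helper names (styblinski_tang, make_objective, run_comparison), the choice of which variables become integers, the trial budget, and the seed are not taken from the experiment behind Figure 1.

```python
from typing import Callable, Sequence


def styblinski_tang(xs: Sequence[float]) -> float:
    """Styblinski-Tang function: 0.5 * sum(x^4 - 16 x^2 + 5 x)."""
    return 0.5 * sum(x**4 - 16 * x**2 + 5 * x for x in xs)


def make_objective(n_dims: int = 8, n_int: int = 0) -> Callable:
    """Build an Optuna objective whose last `n_int` variables are integers.

    Which variables are made integer-valued is an assumption of this sketch.
    """

    def objective(trial) -> float:
        xs = []
        for i in range(n_dims):
            if i >= n_dims - n_int:
                xs.append(trial.suggest_int(f"x{i}", -5, 5))
            else:
                xs.append(trial.suggest_float(f"x{i}", -5.0, 5.0))
        return styblinski_tang(xs)

    return objective


def run_comparison(n_trials: int = 100, seed: int = 0) -> dict:
    """Minimize the continuous and mixed-integer variants with both samplers."""
    # Imported lazily so the helpers above can be used without Optuna installed.
    import optuna

    samplers = {
        "GPSampler": optuna.samplers.GPSampler,
        "CmaEsSampler": optuna.samplers.CmaEsSampler,
    }
    results = {}
    for n_int in (0, 4):  # 0: all continuous, 4: half of the variables are integers
        for name, sampler_cls in samplers.items():
            study = optuna.create_study(sampler=sampler_cls(seed=seed))
            study.optimize(make_objective(n_int=n_int), n_trials=n_trials)
            results[(name, n_int)] = study.best_value
    return results
```

Calling run_comparison() returns the best value found for each (sampler, number of integer variables) pair. It needs Optuna together with the optional packages these samplers rely on, and a single seed is not enough to draw conclusions; a fair comparison would repeat each run over several seeds.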
So, it is essential to use the appropriate algorithm for each problem. However, understanding each algorithm’s working mechanisms, characteristics, strengths, and weaknesses well enough to make a proper selection requires a high level of expertise and places a heavy burden on users.
AutoSampler
Going one step further than the sampler comparison table, we developed AutoSampler, which lets users solve problems well without spending time understanding and selecting samplers. AutoSampler has sampler selection rules designed to achieve the following two goals, and it selects suitable samplers automatically and dynamically during optimization (Figure 2).
- Automatically selects a sampler that can appropriately handle problems with different settings by considering the number of evaluations, search space, constraints, and the number of objectives.
- Automatically selects a sampler that empirically achieves equivalent or better search results than always using Optuna’s default sampler.
Regarding the second point, we noticed a bias in the problems that Optuna users often encounter, particularly with hyperparameter optimization. Based on this experience, we tried to develop a better sampler selection rule instead of always using the default sampler.*¹
AutoSampler’s approach to automatic sampler selection includes the following rules.
- Using GPSampler during the early stages of a search, as it has excellent sample efficiency but is only applicable when the number of evaluations is small.
- Using TPESampler for problems with categorical variables, as it can handle these variables flexibly.
AutoSampler also has elaborate schemes to improve performance when dynamically switching methods during optimization.
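The kind of rule described above can be sketched as a small decision function. This is a hypothetical illustration, not AutoSampler's actual implementation: the function name choose_sampler_name, its parameters, and the required gp_trial_budget threshold are all assumptions, and the choice of CmaEsSampler once the GPSampler budget is used up is likewise an assumption (the article only states that AutoSampler depends on the cmaes package). The fallback to the default for multi-objective and constrained problems follows the limitation mentioned in the wrap-up.

```python
DEFAULT = "default"  # placeholder for "whatever Optuna would use by default"


def choose_sampler_name(
    n_finished_trials: int,
    has_categorical: bool,
    is_multi_objective: bool,
    has_constraints: bool,
    *,
    gp_trial_budget: int,
) -> str:
    """Return the name of the sampler a rule-based selector might use next.

    `gp_trial_budget` is a hypothetical threshold with no default on purpose:
    this sketch does not claim to know the value AutoSampler uses.
    """
    # Settings that this sketch does not handle fall back to the default.
    if is_multi_objective or has_constraints:
        return DEFAULT
    # Categorical variables: TPESampler handles them flexibly.
    if has_categorical:
        return "TPESampler"
    # Early stage: GPSampler is sample-efficient while evaluations are few.
    if n_finished_trials < gp_trial_budget:
        return "GPSampler"
    # Assumption of this sketch: hand over to CmaEsSampler afterwards.
    return "CmaEsSampler"
```

Because the decision depends on the number of finished trials, calling the function before every trial yields a selection that can change during a single optimization run, which is what selecting samplers dynamically means here.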
How to Use AutoSampler
It is very easy to use the AutoSampler published on OptunaHub.
First, install dependencies in your environment to use the package, including the optunahub library.
pip install optunahub
pip install -r https://hub.optuna.org/samplers/auto_sampler/requirements.txt
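If you want to confirm that the installation succeeded before running an optimization, a small check like the following can help. It is only a sketch: the helper name missing_packages is made up for this example, and the package list is an assumption based on the dependencies this article names (optuna, optunahub, cmaes, and torch); the requirements file remains the authoritative list.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level packages in `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed list of import names; adjust it to match the requirements file.
missing = missing_packages(["optuna", "optunahub", "cmaes", "torch"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages are importable.")
```

The check only looks up whether each package can be located; it does not import the packages or verify their versions.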
After that, load the "samplers/auto_sampler" package using the optunahub.load_module function and instantiate the AutoSampler class included in the package. Below is a code example that can be run by copying and pasting.
import optuna
import optunahub


def objective(trial: optuna.Trial) -> float:
    x = trial.suggest_float("x", -1, 1)
    y = trial.suggest_float("y", -1, 1)
    return x**2 + y**2


study = optuna.create_study(sampler=optunahub.load_module("samplers/auto_sampler").AutoSampler())
study.optimize(objective, n_trials=300)
print(study.best_trial.value, study.best_trial.params)
Benchmarking
To demonstrate the effectiveness of AutoSampler, we performed a simple comparative evaluation of search performance using benchmark functions. Figure 3 shows the optimization results for Mishra06(2), Hartmann6(6), and McCourt11(8) (the numbers in parentheses indicate the problem dimensions; implementation of the benchmark functions is available here). In addition, for the two-dimensional function Mishra06(2), a landscape plot (upper right of Figure 3) that visualizes the contours of the function values is also shown. This plot’s vertical and horizontal axes correspond to the two variables, and the colors correspond to the objective function values. Darker areas indicate locations with good objective function values.
The results show that AutoSampler consistently achieves performance equal to or better than the default TPESampler, except at the very beginning of the search. First, the gray dotted line in each figure confirms that AutoSampler, with fewer than 100 trials, found a better solution than the best one the default sampler found in 1000 trials. Next, the narrower filled band indicates that AutoSampler’s search is more stable than the default sampler’s.
Regarding the individual problem results, in Mishra06(2), both AutoSampler and TPESampler stopped improving the objective function value before reaching 400 trials. It seems both converged to local minima. However, there is a significant difference in the final objective function value: AutoSampler finds a better solution on average. In Hartmann6(6), there is no considerable difference in the final objective function value, but AutoSampler converges significantly faster. In McCourt11(8), both methods continued to improve the objective function value until the end, so the search did not converge, but the pace of improvement indicates that AutoSampler has superior search efficiency.
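For readers who want to produce similar curves from their own runs, the sketch below shows one common way to summarize repeated optimization runs: track the best value found so far in each run, then take a central line and a band across runs at every trial. The article does not state how the line and filled band in Figure 3 were computed, so the use of the median and the interquartile range here is an assumption, as are the helper names best_so_far and median_and_band.

```python
from statistics import quantiles
from typing import List, Sequence, Tuple


def best_so_far(values: Sequence[float]) -> List[float]:
    """Running minimum of the objective values of one run (minimization)."""
    curve = []
    best = float("inf")
    for value in values:
        best = min(best, value)
        curve.append(best)
    return curve


def median_and_band(
    runs: Sequence[Sequence[float]],
) -> List[Tuple[float, float, float]]:
    """Per-trial (lower quartile, median, upper quartile) across runs.

    Expects at least two runs; runs are truncated to the shortest one.
    """
    curves = [best_so_far(run) for run in runs]
    summary = []
    for values_at_trial in zip(*curves):
        q1, q2, q3 = quantiles(values_at_trial, n=4, method="inclusive")
        summary.append((q1, q2, q3))
    return summary
```

With Optuna, the per-run input could be built from the values of the completed trials of each study, one study per random seed. A narrow gap between the lower and upper quartile then corresponds to the stable search behavior discussed above.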
Wrapping Up
We introduced AutoSampler, which automatically selects the optimization algorithm for Optuna. It allows users to obtain performance that is empirically equal to or better than the default sampler without worrying about which optimization algorithm to use.*² Using it requires changing just one line of code; please try it!
The first version of AutoSampler has room for improvement, such as falling back to the default in the case of multi-objective optimization and constrained optimization, and we plan to continue improving it in the future. We are actively seeking user feedback to improve AutoSampler. If you have any impressions or requests for improvements, please post them in the relevant thread on Optuna Discussions!
Supplements
*¹ According to the no-free-lunch theorem, every optimization algorithm has the same performance when averaged over all possible problems. Therefore, when discussing the superiority or inferiority of optimization algorithms, some assumptions about the target problems are required.
*² Some readers may wonder, “Why not make AutoSampler the default?” In software development, there are many things to consider besides optimization performance, and the default provides a good balance when considering all of these factors comprehensively. For example, AutoSampler depends on experimental features such as GPSampler and additional packages such as cmaes and torch, but the default does not have such dependencies.