Announcing Optuna 3.0 (Part 2)

mamu · Published in Optuna · 7 min read · Aug 29, 2022

This is the second half of the release blog written for the Optuna v3.0 release. If you have not yet read the first half, please start there!

In the first half of the blog, we explained how we improved Optuna’s stability in v3.0 and how we benchmarked our optimization algorithms. In this second half, we discuss the various new features added in Optuna v3.0, as well as some roadmap items that, after various twists and turns, were ultimately not implemented.

New Features

When we started developing Optuna v3.0, we did not intend to introduce any specific new features. However, as many contributors worked together on the development, we ended up with a number of new features that were considered important to the Optuna community. Here is an overview of those new features.

New NSGA-II Crossover Options

NSGAIISampler performs well in multi-objective optimization and is Optuna’s default algorithm for it. With its default crossover, however, NSGA-II handles categorical parameters well but often does not perform well enough on floating-point parameters. With the new crossover options, NSGA-II also performs well on real-valued variables. If you are interested in the algorithm and the performance-evaluation experiments, please visit #2903, #3221, and the document.

This feature was introduced by @yoshinobc and brushed up by @xadrianzetx. Thank you!

[Figure] Pareto-front plots after 140 optimization trials on the NAS-Bench-101 problem, comparing NSGA-II with the newly implemented crossover option against the existing NSGA-II (top right), the existing multi-objective method BoTorch (bottom right), and another NSGA-II implementation (top middle).

A New Algorithm: Quasi-Monte Carlo Sampler

It is known that random search struggles to adequately cover the search space in high-dimensional problems. Quasi-Monte Carlo search can cover a high-dimensional search space with a relatively small number of samples. The newly implemented QMCSampler can replace RandomSampler, especially for higher-dimensional problems. See #2423 and the document for more information. Also, if you are interested in QMCSampler’s performance, please visit #2964.

This feature was introduced by @kstoneriv3. Thank you!

Constraints Support for Pareto-front Plot

The Pareto-front plot is an important analysis tool for evaluating the results of multi-objective optimization. Optuna can perform constrained multi-objective optimization, but until now its Pareto-front plot could not distinguish whether each trial satisfied the constraints. With this feature, it is now possible to see at a glance which trials are feasible and which are not. For more information, please see the following PRs (#3128, #3497, and #3389) and the document.

This feature was introduced by @semiexp for the Plotly version and @fukatani for the Matplotlib version, and the implementation was unified by @semiexp. Thank you!

A New Importance Evaluator: Shapley Importance Evaluator

Hyperparameter importance evaluation was introduced in Optuna v2. It is an analysis method that determines which hyperparameters are important based on optimization results. In v3.0, we introduce a new importance evaluator, optuna.integration.ShapleyImportanceEvaluator, which uses SHAP. See #3507 and the document for more information.

This feature was introduced by @liaison. Thank you!

Constrained Optimization Support for TPE

Tree-structured Parzen Estimator (TPE) is Optuna’s default sampling algorithm. It did not previously support constrained optimization, but as of v3.0 it does: passing a constraints function to TPESampler makes the sampler aware of constraint violations when proposing new parameters. For more information on this feature, please visit #3506 and the document.

This feature was introduced by @knshnb. Thank you!

New History Visualization with Multiple Studies

The optimization history plot is one of the most basic plots for analyzing optimization. Previously, it was only possible to plot the optimization history of a single study, but now it is possible to compare multiple studies or to display the mean and variance of multiple studies optimized with the same settings. For more information, please see the following multiple PRs (#2807, #3062, #3122, and #3736) and the document.

This feature was introduced by @HideakiImamura for the Plotly version and @TakuyaInoue-github for the Matplotlib version, and the implementation was unified by @knshnb. Thank you!

Deliberate Exclusions

We addressed a number of items in v3.0, but not all of them were incorporated. For example, we concluded that some items related to the optuna.storage module would be difficult to complete in v3.0 and we had to postpone development. Below is a list of such items.

  • Simplify the after_trial arguments
The optuna.samplers.BaseSampler.after_trial is a method that is called after the end of a trial to perform post-processing for each sampler. This method is only used by users who implement their own samplers, so only a limited number of users would benefit from simplifying it. On the other hand, modifying this method would require trials to be editable after they finish, which conflicts with the base design of Optuna storage. We decided not to work on this item in v3.0 because the risks outweighed the possible benefits.
  • Reorganize storages API and make BaseStorage public
The optuna.storage module is responsible for storing the history of optimizations. Due to its long history, the module has accumulated some technical debt. For example, it is not clear whether the abstract class optuna.storages.BaseStorage is exposed to library users. To resolve the inconsistency, we tried to expose BaseStorage, but decided not to do so in v3.0 because the API is too complex and there is still room for improvement. A redesign of BaseStorage is currently in progress.
  • Change default sampling algorithm
Even though Optuna’s default sampling algorithm, optuna.samplers.TPESampler, has evolved significantly since the release of Optuna v1, its default behavior has not changed at all. We attempted to change the default algorithm, including the default arguments to TPESampler, through benchmarking experiments in order to bring the evolved TPESampler functionality to a larger number of users. However, our experimental results showed that the advanced non-default functionality of TPESampler improved performance in many cases but worsened it in others. Considering the total number of users affected by the change, we decided not to change the default in v3.0.
  • Rename sample_relative
    The optuna.samplers.BaseSampler.sample_relative is a method that samples from a joint distribution considering correlations between variables. It was suggested that a name like sample_joint would be more appropriate, and we considered changing it. However, we decided that renaming a method that is publicly available to users carries risk while offering minimal benefit.

What’s Ahead

Optuna v3.0 was successful thanks to help from many contributors. Optuna’s user and development communities have made great leaps, making the framework an important part of the hyperparameter optimization community. However, the development of Optuna does not end here. We will continue to develop challenging features, and at the same time, we will continue to make it easier for contributors to participate in the development process. Optuna is constantly evolving and we are always looking for new contributors. If you are interested, please reach out to us on GitHub or join one of our upcoming development sprints!

We’ve begun development of v3.1, to which we’re planning to add the following features. Stay tuned!

  • Provide journal-based storage that works with NFS-enabled file storage
  • Verify and refactor known performance degradation of TPE
  • Support for discrete variables in CMA-ES

Contributors

As with any other release, this one would not have been possible without the feedback, code, and comments from many contributors.

@29Takuya, @BasLaa, @CorentinNeovision, @Crissman, @HideakiImamura, @Hiroyuki-01, @IEP, @MasahitoKumada, @Rohan138, @TakuyaInoue-github, @abatomunkuev, @akawashiro, @andriyor, @avats-dev, @belldandyxtq, @belltailjp, @c-bata, @captain-pool, @cfkazu, @chezou, @contramundum53, @divyanshugit, @drumehiron, @dubey-anshuman, @fukatani, @g-votte, @gasin, @harupy, @higucheese, @himkt, @hppRC, @hvy, @jmsykes83, @kasparthommen, @kei-mo, @keisuke-umezawa, @keisukefukuda, @knshnb, @kstoneriv3, @liaison, @ll7, @makinzm, @makkimaki, @masaaldosey, @masap, @nlgranger, @not522, @nuka137, @nyanhi, @nzw0301, @semiexp, @shu65, @sidshrivastav, @sile, @solegalli, @takoika, @tohmae, @toshihikoyanase, @tsukudamayo, @tupui, @twsl, @wattlebirdaz, @xadrianzetx, @xuzijian629, @y0z, @yoshinobc, @ytsmiling

Thanks to those who have followed the projects from the very early days and those who have joined along the way.
