Scaling up Optuna with Ray Tune

By Kai Fricke, Crissman Loomis, Richard Liaw
Oct 15, 2020

Optuna and Ray Tune are two of the leading tools for hyperparameter tuning in Python. Optuna provides an easy-to-use interface to advanced hyperparameter search algorithms like the Tree-structured Parzen Estimator (TPE). This makes it an invaluable tool for machine learning engineers and data scientists, and it is a key reason for Optuna's popularity.

Optuna excels at single-machine workloads, but parallelizing them requires manual setup, possibly across multiple machines, and monitoring is not included. This makes operation especially challenging if you want to use GPUs as efficiently as possible. You need not only sensible choices for your parameters, but also a way to organize the execution. This is where Ray Tune shines.

Ray Tune takes your training function and automatically parallelizes it, takes care of the resource management, and can even distribute it across a cluster of machines. All you have to do is run a single script! In a perfect world, we would be able to combine the great algorithms from Optuna with the great scaling capabilities of Ray Tune.

Fortunately, this world exists! Ray Tune integrates with many hyperparameter searching algorithms and frameworks — and Optuna is one of these frameworks! Scaling up your Optuna hyperparameter search is just a matter of wrapping it in a Ray Tune run and making minor changes to your training function.

Even better, if you use Ray Tune as the computation backend, you can leverage advanced scheduling algorithms like Population Based Training that are currently not available in Optuna. You really get the best of both worlds!

How it works

Let’s take a look at the integration of Ray Tune with Optuna.

As you will see, using Optuna search algorithms with Ray Tune is as simple as passing search_alg=OptunaSearch() to the tune.run() call!

All you need to do to get started is install Ray Tune and Optuna, for example with pip install "ray[tune]" optuna.

In this blog post we will train an MNIST classifier using the PyTorch model from the Ray Tune examples. Here’s what the code would look like in Optuna:
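A minimal sketch of the Optuna side, under some loudly stated assumptions: `simulated_accuracy` is a hypothetical toy function standing in for the real MNIST training and evaluation loop, the search ranges are illustrative, and `run_optuna_study` is a helper name chosen for this sketch. It shows the general shape of an Optuna objective, which samples hyperparameters from a `trial` object and returns the value to maximize.

```python
import math


def simulated_accuracy(lr, momentum, epoch):
    """Hypothetical stand-in for one epoch of MNIST training (not the real model)."""
    # Score peaks near lr=1e-2, momentum=0.9 and improves with more epochs.
    penalty = (math.log10(lr) + 2.0) ** 2 + (momentum - 0.9) ** 2
    progress = 1.0 - math.exp(-0.3 * (epoch + 1))
    return progress / (1.0 + penalty)


def objective(trial):
    """Optuna objective: draw hyperparameters from the trial, return the final score."""
    lr = trial.suggest_float("lr", 1e-4, 1e-1, log=True)
    momentum = trial.suggest_float("momentum", 0.1, 0.9)
    accuracy = 0.0
    for epoch in range(20):
        accuracy = simulated_accuracy(lr, momentum, epoch)
        trial.report(accuracy, epoch)  # intermediate value, usable by pruners
    return accuracy


def run_optuna_study(n_trials=10):
    """Run the trials one after another in a single process (requires optuna)."""
    import optuna  # imported here so the objective can be exercised without Optuna

    study = optuna.create_study(direction="maximize")
    study.optimize(objective, n_trials=n_trials)
    return study.best_params, study.best_value
```

Calling `run_optuna_study()` runs the ten trials sequentially and returns the best parameters found together with their score.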

If we want to do the same thing in Ray Tune, the code is very similar:
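A minimal sketch of the Ray Tune side, again with assumptions stated up front: `simulated_accuracy` is a hypothetical toy function in place of the real MNIST training loop, the metric name `mean_accuracy` and the search ranges are illustrative, and `run_tune_search` is a helper name chosen here. The import location of `OptunaSearch` has moved between Ray releases, so the sketch tries both; newer releases also offer a Tuner API alongside `tune.run()`, so check the documentation for your installed version.

```python
import math


def simulated_accuracy(lr, momentum, epoch):
    """Hypothetical stand-in for one epoch of MNIST training (not the real model)."""
    penalty = (math.log10(lr) + 2.0) ** 2 + (momentum - 0.9) ** 2
    return (1.0 - math.exp(-0.3 * (epoch + 1))) / (1.0 + penalty)


def train_mnist(config):
    """Ray Tune trainable: hyperparameters arrive in a plain `config` dict."""
    accuracy = 0.0
    for epoch in range(20):
        accuracy = simulated_accuracy(config["lr"], config["momentum"], epoch)
    # Returning a dict hands it to Tune as the trial's final result.
    return {"mean_accuracy": accuracy}


def run_tune_search(num_samples=10):
    """Launch the search with Optuna suggesting configs (requires ray[tune], optuna)."""
    from ray import tune

    try:
        from ray.tune.search.optuna import OptunaSearch  # Ray 2.x location
    except ImportError:
        from ray.tune.suggest.optuna import OptunaSearch  # older Ray releases

    analysis = tune.run(
        train_mnist,
        # The search space is declared with Tune's own primitives;
        # Tune converts it for Optuna.
        config={
            "lr": tune.loguniform(1e-4, 1e-1),
            "momentum": tune.uniform(0.1, 0.9),
        },
        metric="mean_accuracy",
        mode="max",
        search_alg=OptunaSearch(),
        num_samples=num_samples,
    )
    return analysis.best_config
```

The main difference from a plain Optuna objective is that the training function reads its hyperparameters from `config` instead of asking a `trial` object for them.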

As you can see, the objective function is almost identical. Ray Tune automatically converts the search space to be compatible with Optuna and uses Optuna’s search algorithms for hyperparameter suggestions. By default, this is the Tree-structured Parzen Estimator (TPE), but you can specify any search algorithm available in Optuna.
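As a sketch of swapping the search algorithm, the helper below builds an `OptunaSearch` around a chosen Optuna sampler. The helper name `build_optuna_search`, the `SAMPLERS` lookup table, and the short names in it are assumptions made for this example; `TPESampler` and `RandomSampler` are classes in `optuna.samplers`, and the `OptunaSearch` import path depends on your Ray version.

```python
# Short names (chosen for this sketch) mapped to classes in `optuna.samplers`.
SAMPLERS = {"tpe": "TPESampler", "random": "RandomSampler"}


def build_optuna_search(sampler_name="tpe", seed=0):
    """Return an OptunaSearch that uses the requested Optuna sampler."""
    if sampler_name not in SAMPLERS:
        raise ValueError(
            f"unknown sampler {sampler_name!r}; choose from {sorted(SAMPLERS)}"
        )

    import optuna  # requires optuna and ray[tune] to be installed

    try:
        from ray.tune.search.optuna import OptunaSearch  # Ray 2.x location
    except ImportError:
        from ray.tune.suggest.optuna import OptunaSearch  # older Ray releases

    sampler_cls = getattr(optuna.samplers, SAMPLERS[sampler_name])
    # A fixed seed makes the sequence of suggestions repeatable.
    return OptunaSearch(sampler=sampler_cls(seed=seed))
```

The returned object can be passed as `search_alg` to the Ray Tune call, together with the `metric` and `mode` to optimize.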

How it performs

Let’s do a simple comparison between the single-threaded Optuna implementation and the implementation with Ray Tune.

When we run the Optuna example, we get these (or similar) results:

Training 10 trials for 20 epochs each took about 37 seconds on my machine.

Let’s see how Ray Tune performs:

As you can see, the training run with Ray Tune finished about three times faster than with single-threaded Optuna. This is because Ray Tune was built from the ground up to be scalable and to leverage parallelism as much as possible.
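How many trials actually run side by side depends on the resources each trial requests. The sketch below is a rough model only: `max_concurrent_trials` is a hypothetical helper, and it ignores memory, custom resources, and scheduling overhead. In Ray Tune the per-trial request is passed as `resources_per_trial` to `tune.run()`.

```python
def max_concurrent_trials(total_cpus, cpus_per_trial, total_gpus=0, gpus_per_trial=0):
    """Rough upper bound on how many trials fit on the available resources at once."""
    if cpus_per_trial <= 0:
        raise ValueError("cpus_per_trial must be positive")
    by_cpu = int(total_cpus // cpus_per_trial)
    if gpus_per_trial > 0:
        # GPU requests may be fractional, so several trials can share one GPU.
        by_gpu = int(total_gpus // gpus_per_trial)
        return min(by_cpu, by_gpu)
    return by_cpu


# Example per-trial request, in the form accepted by `resources_per_trial`.
RESOURCES_PER_TRIAL = {"cpu": 2, "gpu": 0}

# On an 8-core machine, 2 CPUs per trial leaves room for 4 trials at a time.
concurrent = max_concurrent_trials(8, RESOURCES_PER_TRIAL["cpu"])
```

Requesting fewer resources per trial allows more trials to run at once, at the cost of each trial having less compute to itself.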

Even better, because of Ray’s distributed computing model, scaling our tuning up to a cluster of tens, hundreds, or even thousands of nodes is just a matter of using the Ray Autoscaler — and you’ll still just have to call this same single script, with no code changes at all.


The two leading tools for hyperparameter tuning in Python, Ray Tune and Optuna, can be used together to get the best of both worlds. Ray Tune offers great parallelization and scalability across clusters of hundreds of machines, as well as advanced scheduling algorithms like Population Based Training. Optuna, on the other hand, brings state-of-the-art search algorithms to the table, helping you find the best hyperparameter combinations more efficiently.

Combining these two libraries only has benefits — and really brings hyperparameter optimization to the next level.

Are you using Optuna and trying to scale up your hyperparameter tuning with Ray Tune? We’d love to hear from you! Join our Slack here!

