Introducing the Stabilized JournalStorage in Optuna 4.0: From Mechanism to Use Case

Hiroki Takizawa
Published in Optuna
Sep 18, 2024

Introduction

The default storage class in Optuna is InMemoryStorage. However, InMemoryStorage does not persist trial histories and cannot be used for distributed optimization. Therefore, Optuna provides multiple storage classes.

In Optuna 4.0, JournalStorage and JournalFileBackend became officially supported. In this post, we explain how they work, what they are useful for, and how to use them.

About JournalStorage

JournalStorage is one of the storage classes in Optuna. Its name comes from the way it records Optuna’s operations as an append-only log, like entries in a journal. The primary motivation for introducing it was to make it easier to implement various backends (such as databases) as storage for Optuna. To achieve this, the design separates responsibilities: the JournalStorage class acts as Optuna’s storage, while a separate backend class handles reading from and writing to the backend. A JournalStorage object receives an object of the appropriate backend class at initialization (see Fig. 1).

Figure 1: A diagram showing the relationships between JournalStorage and related classes with arrows. The JournalStorage class, shown in gray at the top, takes an object of one of the backend classes, shown in blue, as an argument. One backend class can also take an object of a lock class, shown in green, as an optional argument.
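To make this separation concrete, here is a minimal, self-contained sketch of the journal idea: a backend that only appends and reads log entries, and a storage that rebuilds its state by replaying them. All names (ToyFileBackend, ToyJournalStorage, the operation format) are hypothetical and chosen for illustration. This is not Optuna's implementation, and it omits the locking that a real multi-process backend needs.

```python
import json
import os
import tempfile


class ToyFileBackend:
    """Hypothetical backend: knows only how to append and read log entries."""

    def __init__(self, file_path):
        self._file_path = file_path

    def append_logs(self, logs):
        # One JSON object per line, appended to the end of the file.
        with open(self._file_path, "a", encoding="utf-8") as f:
            for log in logs:
                f.write(json.dumps(log) + "\n")

    def read_logs(self, log_number_from):
        # Return every entry from position `log_number_from` onward.
        if not os.path.exists(self._file_path):
            return []
        with open(self._file_path, encoding="utf-8") as f:
            lines = f.read().splitlines()
        return [json.loads(line) for line in lines[log_number_from:]]


class ToyJournalStorage:
    """Hypothetical storage: rebuilds its state by replaying the journal."""

    def __init__(self, backend):
        self._backend = backend
        self._trials = {}  # trial_id -> {"params": {...}, "value": ...}
        self._n_replayed = 0
        self._sync()

    def _sync(self):
        # Replay only the entries this object has not seen yet.
        for log in self._backend.read_logs(self._n_replayed):
            if log["op"] == "create_trial":
                self._trials[log["trial_id"]] = {"params": {}, "value": None}
            elif log["op"] == "set_param":
                self._trials[log["trial_id"]]["params"][log["name"]] = log["value"]
            elif log["op"] == "set_value":
                self._trials[log["trial_id"]]["value"] = log["value"]
            self._n_replayed += 1

    def record(self, log):
        # Write the operation to the journal, then apply it by replaying.
        self._backend.append_logs([log])
        self._sync()

    def get_trials(self):
        self._sync()
        return self._trials


log_path = os.path.join(tempfile.mkdtemp(), "toy_journal.log")

writer = ToyJournalStorage(ToyFileBackend(log_path))
writer.record({"op": "create_trial", "trial_id": 0})
writer.record({"op": "set_param", "trial_id": 0, "name": "x0", "value": 0.5})
writer.record({"op": "set_value", "trial_id": 0, "value": 0.25})

# A second storage object on the same file reconstructs the same state.
reader = ToyJournalStorage(ToyFileBackend(log_path))
print(reader.get_trials())
```

Because the storage class only calls append_logs and read_logs, swapping the file for another medium would mean writing a new backend class and leaving the storage class untouched, which is the point of the design described above.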

JournalStorage was introduced as an experimental feature in Optuna v3.1 and became officially supported in v4.0. With official support, backward compatibility of log files is guaranteed. In addition, class names and module paths have been reorganized, and stability has been improved.

A simple code example using JournalStorage can be written as follows:

import optuna
from optuna.storages import JournalStorage
from optuna.storages.journal import JournalFileBackend


def objective(trial):
    xs = [trial.suggest_float(f"x{n}", -1.0, 1.0) for n in range(3)]
    return sum(x ** 2 for x in xs)


storage = JournalStorage(JournalFileBackend("./optuna_journal_storage.log"))
study = optuna.create_study(storage=storage)
study.optimize(objective, n_trials=20)

As mentioned earlier, the JournalStorage class takes, at initialization, an object of a class that implements access to the backend (in this code, JournalFileBackend). This is one of the key features of JournalStorage, and it makes adding new backends to Optuna straightforward. As of Optuna 4.0, in addition to JournalFileBackend, the JournalRedisBackend class is available for using Redis, a widely used NoSQL database, as a backend. (However, the stabilization in this release covers only JournalFileBackend.)

For further details on JournalStorage, please also refer to the past blog post.

About JournalFileBackend

The JournalFileBackend class provides file-based storage that is compatible with distributed optimization. To use it, pass a JournalFileBackend object to the JournalStorage initializer, as shown in the sample code above. Its greatest advantage is that it enables distributed optimization over a Network File System (NFS). This is achieved by implementing mutual exclusion with system calls that the NFS specification defines as atomic. Two lock-acquisition methods are implemented as separate classes, and you can switch between them by passing an optional argument when initializing JournalFileBackend. For usage instructions, please refer to the documentation; for details on the mechanism, please refer to the past blog post, which explains it alongside the details of JournalStorage.

Besides JournalStorage, Optuna can also store studies with RDBStorage, which supports SQLite3 as well as MySQL. Like JournalFileBackend, SQLite3 uses a single file as storage, which makes it a convenient option. However, when the SQLite3 file is located on NFS, simultaneous access from multiple nodes or processes is known not to work reliably; the official SQLite FAQ mentions this issue. Consequently, in Optuna 4.0, JournalFileBackend is the only option for distributed optimization that uses a single file on NFS as storage.

A Use Case

While running a large-scale distributed optimization with RDBStorage and MySQL, we found that the load on the MySQL server became a bottleneck when analyzing the optimization results. To streamline the analysis and reduce that load, we considered migrating the studies stored in MySQL to another storage with the optuna.copy_study function. We also looked into converting to Redis and SQLite3, but converting to JournalStorage with JournalFileBackend turned out to be the fastest and best suited to our workload. This let us analyze the results more quickly without placing additional load on the MySQL server.

Changes for Stabilization in Optuna 4.0

In Optuna 4.0, class names and module paths were reorganized to make the API more intuitive. The code example above already follows the v4.0 API; the relevant lines are repeated below:

from optuna.storages import JournalStorage
from optuna.storages.journal import JournalFileBackend
storage = JournalStorage(JournalFileBackend("./optuna_journal_storage.log"))

Code written with the v3.x names still works for the time being, but those names are deprecated. For details about the new module paths, please refer to the documentation and migration guide.

All log files created since the release of v3.1 remain usable in v4.0 and beyond! To use an existing file, pass its path to the JournalFileBackend class in the same way as in the code example above.

Conclusions

In this article, we introduced the usage and a practical use case of the storage features stabilized in Optuna 4.0. JournalStorage with JournalFileBackend is a powerful method that supports distributed optimization while relying only on NFS. We encourage you to give it a try!

Optuna 4.0 has made significant progress in many areas, in addition to stabilizing JournalStorage. For more details, check out the Optuna 4.0 release blog!

Hiroki Takizawa
Editor for Optuna

I obtained my Ph.D. in Bioinformatics from the University of Tokyo. I am presently working at Preferred Networks Inc.