A lot happened to me over the past summer: I applied for leave from my Ph.D. program for medical reasons, spent nearly a month in quarantine, visited my family and close friends after a two-year absence, travelled through north-western China, and finished my Google Summer of Code 2021 (GSoC 2021) project: the development of Optuna-Dashboard.
To some extent, Optuna-Dashboard is the “TensorBoard” of Optuna, a hyperparameter optimization library for machine learning written in pure Python. During hyperparameter tuning, many variables (e.g., weight initialization method, network depth in DNNs, learning rate) influence the final model performance. Because these variables are high-dimensional and heterogeneous (a hyperparameter can be an integer, a floating-point number, or a categorical value), it’s hard to get an intuitive sense of how each one contributes to the performance metric and interacts with the others. Optuna-Dashboard addresses this problem by presenting a range of visualizations in a web browser. It’s interactive and updates in real time, which makes it convenient to explore and monitor the hyperparameter optimization process, locally or remotely.
My task in this year’s GSoC project was to work with the Optuna-Dashboard committers to develop and improve Optuna-Dashboard. My work focused mainly on the following topics:
- Port more visualization functions to Optuna-Dashboard
- Add support for multi-objective studies
- Add a preference panel
- Add unit tests
- Explore replacing the backend of some analysis methods with WebAssembly modules
Optuna-Dashboard works by launching a backend server on the host machine. The server first connects to the Optuna storage engine, then serves both the static files for a web app (written in React) and a set of APIs (written with Bottle) that call the analysis methods from Optuna. Once the web app is loaded and a study is selected, the frontend periodically requests trial data from the backend for visualization. Optuna itself provides many visualization functions out of the box, with Matplotlib and Plotly backends. However, they cannot be used directly in the dashboard because they are designed for local use in scripts and IPython notebooks. This summer, we ported nearly all of Optuna’s native visualization functions to TypeScript in the web app.
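To make the request flow concrete, here is a minimal sketch of a backend that serves trial data as JSON and a client that fetches it. It uses only the Python standard library rather than Bottle and React, and the route `/api/studies/<study_id>`, the `FAKE_STORAGE` dictionary, and the `fetch_trials` helper are all assumptions made for illustration, not the dashboard's actual code.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical in-memory stand-in for the Optuna storage engine.
FAKE_STORAGE = {
    1: [
        {"number": 0, "state": "COMPLETE", "value": 0.42, "params": {"lr": 0.01}},
        {"number": 1, "state": "COMPLETE", "value": 0.37, "params": {"lr": 0.003}},
    ]
}


class ApiHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Hypothetical route: /api/studies/<study_id>
        parts = self.path.strip("/").split("/")
        if len(parts) == 3 and parts[:2] == ["api", "studies"] and parts[2].isdigit():
            trials = FAKE_STORAGE.get(int(parts[2]))
            if trials is not None:
                self._send_json(200, {"trials": trials})
                return
        self._send_json(404, {"reason": "not found"})

    def _send_json(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_trials(base_url, study_id):
    """Client side: ask the backend for the raw trial data of one study."""
    # An empty ProxyHandler makes the loopback request ignore any system proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"{base_url}/api/studies/{study_id}") as resp:
        return json.loads(resp.read())["trials"]


server = ThreadingHTTPServer(("127.0.0.1", 0), ApiHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base_url = f"http://127.0.0.1:{server.server_port}"

trials = fetch_trials(base_url, 1)  # a real frontend would repeat this on a timer
print(len(trials), "trials; best value:", min(t["value"] for t in trials))

server.shutdown()
server.server_close()
```

The point of the sketch is the division of labour: the server only hands out trial data, and everything needed for plotting is computed by whoever calls `fetch_trials`.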
Add support for multi-objective studies
We also brought another strength of Optuna to Optuna-Dashboard: support for multi-objective studies. Unlike a single-objective study, a multi-objective study needs extra interface controls to present its visualizations. In Optuna, users pick which objective to plot with a Python code snippet, but the dashboard cannot accept arbitrary snippets (code injection could be dangerous), so the target has to be chosen through the interface instead.
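One way to picture the difference is to select the objective by index instead of by code. The following is only a sketch under that assumption: `TrialRecord` and `objective_series` are hypothetical names, and the index stands in for whatever control the interface exposes.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class TrialRecord:
    """Hypothetical, simplified stand-in for a finished trial."""
    number: int
    values: List[float]  # one value per objective


def objective_series(trials: Sequence[TrialRecord], objective_id: int) -> List[float]:
    """Return the values of one objective, chosen by index rather than by code."""
    if not trials:
        return []
    n_objectives = len(trials[0].values)
    if not 0 <= objective_id < n_objectives:
        raise ValueError(
            f"objective_id must be in [0, {n_objectives}), got {objective_id}"
        )
    return [t.values[objective_id] for t in trials]


# Two objectives per trial, e.g. accuracy and latency (made-up numbers).
trials = [
    TrialRecord(0, [0.91, 120.0]),
    TrialRecord(1, [0.88, 95.0]),
    TrialRecord(2, [0.93, 140.0]),
]
print(objective_series(trials, 1))  # the index would come from a UI control
```

Because the only user input is an integer that gets validated, nothing the user types is ever executed.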
Add a preference panel
Besides that, we added a preference panel to the dashboard. Re-analysis and re-plotting can be slow if a study has many trials, and some graphs are unnecessary if the user needs only a few of them. A panel that stores the user’s graph selection in the browser’s local storage lets users customize the dashboard for themselves.
Add unit tests
My unit-testing work mainly involved adding tests for the frontend, which turned out to be harder than I expected. At first, we thought a simple testing library would be enough, but we were wrong. Tests for the graphs were easy to write, but things became harder when it came to the tables. There were two major frameworks to choose from: Jest and React Testing Library. The former provides APIs to manipulate the elements inside a table, but it works on the node tree rather than on the rendered HTML output, and it did not support React 17. The latter focuses on the rendered output but offers limited ways to identify and extract elements, which can be crucial if one needs to check specific columns of a table against conditions.
In the spirit of testing, we all agreed that a testing library should measure and interact with the final interface (the rendered HTML) directly, so React Testing Library became our choice. But its limited API makes potential table tests very cumbersome, if not outright impossible.
Explore replacing the backend of some analysis methods with WebAssembly modules
After these parts were finished, the project gradually settled down and I returned to China. We then tried another idea: replacing the parameter importance API with a WASM module. The attempt failed because Go, which I used to implement the random tree algorithm, lacks mature libraries for the complicated operations involved, and I had no time to implement one from scratch. The existing solution is somewhat ad hoc. For most visualizations, we have JS implementations that are functionally equivalent to the Optuna APIs of the same names and need only trial data to compute. For the parameter importance visualization, however, we query the backend to compute the importance values. This introduces inconsistency and redundancy into the system. Besides, because Optuna-Dashboard is usually used to monitor optimization progress, it sends queries periodically at a relatively short interval. That consumes far more resources than necessary: every parameter importance evaluation can be heavy and has to start from scratch, while the importance values may barely change within the interval.
Rethinking the Optuna-Dashboard architecture
This failed exploration, along with some problems encountered in the earlier parts, made me rethink the architecture of Optuna-Dashboard. For the visualization APIs, on the one hand, the current solution keeps most API requests simple: the client only needs trial data for plotting, and the computation happens on the client side, which relieves the backend of heavy computation. On the other hand, it creates redundancy and violates the DRY principle, because we have two sets of functions for one purpose. If Optuna changes a visualization API in the future, we will have to change the dashboard’s counterpart as well. Of course, this won’t be a big issue once Optuna enters a stable phase.
An alternative solution could be as follows: instead of asking for the raw trial list, which can be very heavy for a long-running optimization study, the client asks the server to compute and render all visualizations. This 1) removes the need for separate implementations of the same visualization function and 2) reduces both the amount of data transferred and the resource consumption in the browser. Both Matplotlib and Plotly can already export figure objects as PNG or HTML content, so this is technically feasible.
This could be problematic if multiple clients were connected to the same backend, but luckily that is not how Optuna-Dashboard is designed to be used. Even if that were the case, the computation load could be reduced with a cache: the backend recalculates the visualizations only when the trials have been updated. But what is the price of doing so? I think it would make interactive graphs harder to build, as all recalculation happens in the backend and could be slow.
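The cache idea can be sketched in a few lines. This is not something we implemented; `FigureCache`, its fingerprint, and the placeholder `render` function are assumptions for illustration, and the fingerprint assumes that a study only changes when a trial is added or a trial's state changes.

```python
from typing import Callable, Dict, Hashable, List, Tuple


class FigureCache:
    """Re-render a study's figure only when its trials have changed."""

    def __init__(self, render: Callable[[List[dict]], str]) -> None:
        self._render = render
        self._cache: Dict[int, Tuple[Hashable, str]] = {}
        self.render_calls = 0  # counts how often the expensive step actually runs

    @staticmethod
    def _fingerprint(trials: List[dict]) -> Hashable:
        # Assumption: trial number plus state is enough to detect an update.
        return tuple((t["number"], t["state"]) for t in trials)

    def get(self, study_id: int, trials: List[dict]) -> str:
        key = self._fingerprint(trials)
        cached = self._cache.get(study_id)
        if cached is not None and cached[0] == key:
            return cached[1]  # nothing changed: reuse the rendered figure
        self.render_calls += 1
        figure = self._render(trials)
        self._cache[study_id] = (key, figure)
        return figure


def render(trials: List[dict]) -> str:
    """Placeholder for an expensive server-side rendering step."""
    return f"<div>figure with {len(trials)} trials</div>"


cache = FigureCache(render)
trials = [{"number": 0, "state": "COMPLETE"}]

first = cache.get(1, trials)
second = cache.get(1, trials)                    # served from the cache
trials.append({"number": 1, "state": "RUNNING"})
third = cache.get(1, trials)                     # a new trial forces a re-render

print(cache.render_calls)
```

With periodic polling, most requests would hit the cached branch, so the cost of a heavy computation such as parameter importance would be paid only when the study actually changes.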
Optuna-Dashboard is a terrific tool for monitoring and analyzing optimizations in Optuna. This summer, we helped improve it through GUI work, unit tests, API changes, and more. Overall, we completed all the required items in our proposal. Regrettably, we didn’t finish some of the optional ones: for a number of reasons, we didn’t have time this year to explore all the ideas discussed above or some other proposals (e.g., a JupyterLab extension). Still, I feel lucky, because I learned a lot from this project, from testing to collaborating as a team on GitHub. I’d like to thank my supervisors in the GSoC project, Masashi Shibata and Crissman Loomis. Working with you was an awesome and really pleasant experience.