Hyperparameter Optimization with Optuna and RAPIDS
Hyperparameter optimization (HPO) is the process of selecting values for a model’s hyperparameters to build the most accurate estimator possible. Done right, HPO boosts model performance with minimal intervention. But without the right tools, it can become slow and labor-intensive. Automating the process, and ensuring a single experiment doesn’t take days to run, can save time for almost any data scientist. Optuna is a lightweight framework, developed by Preferred Networks, that aims to automate HPO. RAPIDS provides GPU-accelerated libraries for executing end-to-end data science pipelines, accelerating both the ETL and ML training steps.
Combining the Optuna and RAPIDS libraries can help run experiments faster while yielding better-performing models. They can be integrated into almost any workflow with just a few changes, as illustrated in the Optuna RAPIDS demo notebook. The notebook uses the BNP Paribas Cardif Claims Management dataset from Kaggle to predict whether a claim will receive accelerated approval. With a few lines of code for automated tuning, we were able to reduce the log loss by roughly 40% (from 14.1 to 8.2). The performance improvements grow with larger datasets and wider parameter ranges.
This approach can be scaled up via the easy-to-use Dask distributed framework, particularly with the new dask-optuna plugin, recently released by James Bourbeau from Coiled.io. The plugin simplifies configuring a cluster for Optuna HPO by removing the extra step of setting up a database backend: trial results are coordinated through Dask-backed storage instead.
The RAPIDS HPO webpage is a great starting point for anyone looking to learn about GPU-accelerated, automated hyperparameter optimization, and it covers many frameworks in addition to Optuna. For deeper dives focused on HPO with Optuna, here are some additional resources:
- Check out this talk from JupyterCon 2020, which walks through a demo Jupyter notebook using RAPIDS, Optuna, xfeat, and MLflow. It illustrates feature engineering and hyperparameter optimization on a classification problem, in conjunction with experiment tracking and eventual production deployment.