File Management during LLM (Large Language Model) Training with the Optuna v4.0.0 Artifact Store
TL;DR
- Artifact Store, which manages files generated during optimization on storage backends such as the local file system and Amazon S3, is officially supported in Optuna v4.0.0,
- Artifact Store lets users view a wide range of file formats, such as image and audio files, on Optuna Dashboard,
- Optuna v4.0.0 extended the Python API, making it easier to download artifacts from Artifact Store,
- Furthermore, Optuna Dashboard newly supports CSV and JSONL file viewers, which display these files in tabular format, and
- In this article, I explain the improved usability through an experiment with a large language model.
I kindly ask readers to refer to a simpler example and the Artifact Store Tutorial for more details. Note that, because Optuna v4.0.0 is a beta version, readers need to install the beta version explicitly to reproduce the experiment in this article.
# Optuna Dashboard must be v0.16.0 or later.
$ pip install optuna==4.0.0b0 "optuna-dashboard>=0.16.0"
What is Artifact/Artifact Store?
An artifact is a file generated or used during an optimization. For example, as seen in the red rectangle of Figure 1, each Trial during hyperparameter optimization of a large language model may generate various files such as a learning curve plot, inference results in the CSV format, and model snapshot files. Artifact Store is convenient for managing such artifacts and visualizing them on Optuna Dashboard.
With Artifact Store, which is officially supported from v4.0.0, users can manage files associated with a Trial or a Study. Because Artifact Store accepts various storage backends as the save destination, users can store artifacts not only in the local file system but also in object storage such as Google Cloud Storage (GCS) and Amazon S3 compatible storage.
As mentioned earlier, an advantage of Artifact Store is that the contents of artifacts can be viewed directly on Optuna Dashboard. For example, as shown in Figure 2, a JSONL (or CSV) file is displayed in tabular format. Furthermore, it is possible to play an audio or a video file on Optuna Dashboard.
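To make the JSONL format concrete, the following is a minimal sketch that writes one JSON object per line using only the Python standard library. The file name `inference.jsonl` and the `question`/`response` keys are hypothetical and chosen purely for illustration; they are not taken from the code on Gist.

```python
import json
import os
import tempfile

# Hypothetical question/response pairs standing in for real LLM outputs.
records = [
    {"question": "What is Optuna?", "response": "A hyperparameter optimization framework."},
    {"question": "What is an artifact?", "response": "A file generated or used during optimization."},
]

# Assumption: a temporary directory is used so the sketch leaves no files behind.
inference_path = os.path.join(tempfile.mkdtemp(), "inference.jsonl")

# JSONL stores one JSON object per line.
with open(inference_path, "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Read the file back to confirm every line parses as a standalone JSON object.
with open(inference_path, encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]
```

Because every line shares the same keys, a file like this maps naturally onto rows and columns, which is what makes a tabular view possible.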
Modifications Made in Optuna v4.0.0
Optuna v4.0.0 improved not only the visualization on Optuna Dashboard, such as the table artifact viewer shown in Figure 2, but also the usability of the Python API. More specifically, we stabilized the artifact upload API and added two new APIs: the artifact download API, and an API that lists all the artifact metadata associated with a specific Trial or Study, which is needed for the download.
With these changes, it becomes much easier for user scripts to use artifacts for post-hoc analysis or to retrieve the artifacts associated with the best Trial. For example, if each Trial uploads a compressed file of LLM snapshots to Artifact Store, the snapshots for the best Trial can be easily downloaded to the local file system via the new API. In the next section, I would like to demonstrate the API usage with actual code.
Use Case of Artifact Store: Hyperparameter Optimization of LLM
In this section, I would like to explain a use case of Artifact Store with a local file system. First, we optimize the hyperparameters of an LLM using Optuna and show the results on Optuna Dashboard. The actual code is available on Gist.
In this example, each Trial uploads the following Artifact files:
- The training log of LLM (CSV file)
- The responses by the trained LLM to each question (JSONL file)
- The learning curve plot (PNG file)
- The model snapshot file (GZip file)
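As an illustration of the last item, a snapshot directory can be packed into a single gzip-compressed tarball before it is uploaded. The sketch below uses only the standard library; the `snapshot` directory and the `weights.bin` placeholder are hypothetical stand-ins for whatever the trainer actually saves.

```python
import os
import tarfile
import tempfile

# Assumption: a temporary directory stands in for the training output directory.
workdir = tempfile.mkdtemp()

# Hypothetical snapshot directory containing a placeholder weights file.
snapshot_dir = os.path.join(workdir, "snapshot")
os.makedirs(snapshot_dir)
with open(os.path.join(snapshot_dir, "weights.bin"), "wb") as f:
    f.write(b"\x00" * 16)

# Compress the whole directory into a single gzip-compressed tarball.
model_path = os.path.join(workdir, "model.tar.gz")
with tarfile.open(model_path, "w:gz") as tar:
    tar.add(snapshot_dir, arcname="snapshot")

# List the archive members to confirm the snapshot was packed.
with tarfile.open(model_path, "r:gz") as tar:
    members = sorted(tar.getnames())
```

Packing everything into one file keeps the upload to a single artifact per Trial, which also simplifies looking it up by file name later.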
We first run the code on Gist and launch Optuna Dashboard based on the results obtained by the script:
# Launch Optuna Dashboard with the URL of RDB and the Artifact base_path.
$ optuna-dashboard sqlite:///demo.db --artifact-dir artifacts
The following video shows how Optuna Dashboard will look when we launch it:
To illustrate how the Python API is used, the following is a modified excerpt of the code on Gist:
import os

import optuna

# Artifacts will be stored in this directory.
base_path = "artifacts"
# Create the directory.
os.makedirs(base_path, exist_ok=True)
# Instantiate an Artifact Store with the directory path.
artifact_store = optuna.artifacts.FileSystemArtifactStore(base_path)


def objective(trial):
    # Suggest hyperparameters by Optuna.
    train_params = suggest_train_params(...)
    # Train an LLM using the hyperparameters suggested by Optuna.
    trainer = ...
    trainer.train()
    # Record the responses by LLM to each question as a JSONL file.
    inference(...)
    # Upload the JSONL file to Artifact Store.
    optuna.artifacts.upload_artifact(
        study_or_trial=trial, file_path=inference_path, artifact_store=artifact_store
    )
    # Upload the learning curve plot, log, and snapshots in the same way.
    ...
    valid_loss = ...
    return valid_loss


storage = optuna.storages.RDBStorage("sqlite:///demo.db")
study = optuna.create_study(storage=storage, study_name="demo")
study.optimize(objective, n_trials=10)
In this example, each artifact is uploaded to the base_path specified in FileSystemArtifactStore using upload_artifact, which is one of the Python APIs. As the example above shows, an upload takes only a single call as long as the file already exists.
Additionally, the model snapshot for the best Trial can be downloaded easily from user scripts with the new APIs:
# Get the best Trial.
best_trial = study.best_trial
# The file name used for the uploads of model snapshots in each Trial.
model_file_name = "model.tar.gz"
# Get all the artifact metadata associated with the best Trial.
artifact_meta = optuna.artifacts.get_all_artifact_meta(best_trial, storage=storage)
# Get the Artifact ID of the model snapshot file.
artifact_id_for_model = [am.artifact_id for am in artifact_meta if am.filename == model_file_name][0]
# Download the model snapshot trained in the best Trial to download_file_path.
download_file_path = "./best_model.tar.gz"
optuna.artifacts.download_artifact(
    artifact_store=artifact_store,
    file_path=download_file_path,
    artifact_id=artifact_id_for_model,
)
As shown above, the model snapshot can be easily downloaded with the new APIs by specifying a download path. The new APIs make the reuse of artifacts much simpler.
Conclusion
Optuna v4.0.0 enhanced the Python APIs for Artifact Store, Optuna's file management mechanism. As demonstrated in this article, reusing artifacts from user scripts has become much simpler. In addition, the visualization of artifacts on Optuna Dashboard has improved, and users can now view CSV and JSONL files in tabular format. Last but not least, a tutorial and a simpler example are also available for Artifact Store. Please check them out as well!