Harnessing Human Insight: The Power of Human-in-the-Loop Optimization with Optuna (Part 1)

mayank khulbe
The Good Food Economy

In the realm of optimization, where algorithms tirelessly search for the most efficient solutions, a new approach emerges: human-in-the-loop optimization. This methodology, often abbreviated as HITL, bridges the gap between raw computational power and human expertise, creating a collaborative environment where both elements complement each other. Imagine embarking on a quest for the optimal solution, equipped not only with advanced algorithms but also with the nuanced understanding and intuition of a human guide.

Human and Machine interaction

In tasks such as image generation, natural language processing, or speech synthesis, evaluating results purely through mechanical means can be challenging due to the subjective nature of the output. Human evaluation becomes crucial in discerning subtle nuances and ensuring the quality and authenticity of the generated content. Human-in-the-loop (HITL) optimization proves invaluable in these scenarios by integrating human feedback directly into the optimization process, ultimately enhancing the overall quality and relevance of the results.

Meet Optuna and Optuna Dashboard: Your Optimization Ally

Optuna is a versatile Python-based framework designed for hyperparameter optimization. Its primary function is to streamline the optimization process by efficiently exploring hyperparameter search spaces, aiding in tasks like tuning machine learning models and computational workflows.

Optuna also offers a user-friendly interface and integrates seamlessly with an interactive dashboard, known as the Optuna dashboard. This dashboard provides users with real-time visualisation of optimization results.
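
If you haven't used Optuna before, its core loop is only a few lines. Here is a minimal, hypothetical example (a toy quadratic objective, unrelated to GANs) just to illustrate the study/trial workflow:

```python
import optuna


def objective(trial):
    # Sample a hyperparameter from a continuous search space.
    x = trial.suggest_float("x", -10.0, 10.0)
    # Return the value Optuna should minimize.
    return (x - 2) ** 2


study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params)  # best value of x found so far
```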

Curious to get started with Optuna for hyperparameter tuning? Explore my previous blog for an insightful guide and unleash the power of optimization.

Now, equipped with a foundational understanding of both Human-in-the-Loop optimization and Optuna’s capabilities, let’s embark on the practical implementation. We’ll dive into Python and leverage Optuna and Optuna Dashboard to implement HITL in Generative Adversarial Networks (GANs), using the MNIST dataset as our canvas. Let’s explore how this fusion of human insight and computational power elevates the art of optimization to new horizons.

Implementing HITL Optimization with Optuna

Before we dive into the hands-on implementation, it’s essential to establish an understanding of Generative Adversarial Networks (GANs) which serve as the cornerstone of our chosen use case. By grasping the fundamentals of GANs, we lay a solid foundation for our exploration and implementation of Human-in-the-Loop (HITL) optimization techniques.

Optuna provides two types of HITL optimization: Optimization using the Objective Form Widget and Preferential Optimization.

What are GANs?

Generative Adversarial Networks, or GANs, are a class of machine learning models designed to generate new data samples that are similar to a given training dataset. The key idea behind GANs is to train two neural networks simultaneously: a generator and a discriminator.

GANs training flow
  • Generator — A neural network (a CNN or a feed-forward network) responsible for generating fake samples. It takes random noise as input and transforms it into a sample that resembles real data, learning to produce increasingly realistic samples over the course of training.
  • Discriminator — A neural network responsible for distinguishing between real and fake samples. It is trained to differentiate between real data and data generated by the generator, with the aim of improving its ability to accurately classify samples as real or fake.
  • Feedback Loop — The feedback loop is an iterative process in which the generator and discriminator engage in a continuous competition. Initially, the generator produces samples from random noise. The discriminator then evaluates these samples, aiming to distinguish between real and fake data. As the discriminator provides feedback by classifying samples, the generator adjusts its parameters to generate more convincing samples, attempting to deceive the discriminator.

Think of the relationship between the generator and the discriminator as a game between a forger and a detective. The generator acts as the crafty forger, creating fake images from random noise to fool the discriminator, while the discriminator plays the role of the sharp-eyed detective, learning to spot the real images from the fake ones.

Generator and Discriminator

As they go through this game-like training process, both the generator and discriminator try to outwit each other, improving their strategies along the way. The generator’s goal is to make its fake images as similar to real ones as possible, while the discriminator aims to get better at telling the real images apart from the fake ones.

So, in simple terms, the generator tries to minimize its own loss (make its fakes more convincing) while pushing the discriminator's loss up (making its mistakes harder to catch). This back-and-forth pushes the GAN towards a balance where the generator creates images that are hard to distinguish from real ones, and the discriminator gets better at spotting the differences.
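
To make the adversarial game concrete, here is a minimal, self-contained PyTorch sketch of a single training step. The tiny fully-connected networks and the random tensor standing in for an MNIST batch are placeholders, not the architectures used in the repository:

```python
import torch
import torch.nn as nn

latent_dim, img_dim, batch_size = 64, 28 * 28, 32

# Tiny placeholder networks; the repository uses proper CNN architectures.
generator = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                          nn.Linear(256, img_dim), nn.Tanh())
discriminator = nn.Sequential(nn.Linear(img_dim, 256), nn.LeakyReLU(0.2),
                              nn.Linear(256, 1), nn.Sigmoid())

criterion = nn.BCELoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

real_images = torch.rand(batch_size, img_dim)   # stand-in for a real MNIST batch
real_labels = torch.ones(batch_size, 1)
fake_labels = torch.zeros(batch_size, 1)

# Discriminator step: improve its ability to tell real from fake.
noise = torch.randn(batch_size, latent_dim)
fake_images = generator(noise)
d_loss = (criterion(discriminator(real_images), real_labels)
          + criterion(discriminator(fake_images.detach()), fake_labels))
opt_d.zero_grad()
d_loss.backward()
opt_d.step()

# Generator step: minimize its loss by fooling the discriminator into saying "real".
g_loss = criterion(discriminator(fake_images), real_labels)
opt_g.zero_grad()
g_loss.backward()
opt_g.step()
```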

The primary focus of this blog is to explore Human-in-the-Loop (HITL) optimization techniques using Optuna in Python, with Generative Adversarial Networks (GANs) serving as our practical application. Rather than delving deeply into GANs themselves, our goal is to leverage them as a vehicle for understanding and implementing HITL optimization strategies with Optuna.

HITL using Optuna

How does HITL work in Optuna?

As mentioned earlier, Human-in-the-loop (HITL) is a concept where humans play a role in machine learning or artificial intelligence systems. In HITL optimization, in particular, humans are part of the optimization process. HITL optimization is valuable in areas where human judgment is essential, like art and design, since it’s hard for machines to evaluate the output. For instance, it can optimize images created by generative models.

To implement HITL optimization, you need a way to interactively execute the optimization process, typically through a user interface (UI). Optuna dashboard comes to your rescue by providing a user-friendly interface to facilitate this interaction.

HITL using the optuna dashboard

In HITL optimization using Optuna Dashboard, there are primarily three components:

  • A script that samples hyperparameters from Optuna and generates the output to be evaluated (generated images, in this case).
  • A database and an artifact store to hold the experiment data and artifacts (images, files, etc.).
  • The Optuna Dashboard for displaying the outputs kept in storage and collecting evaluations.

Given that Optuna is our optimization tool of choice, the next step is to define an objective function. This function acts as our compass and returns the metric(s) we seek to minimize or maximize within our model.

Optimization using Objective Form Widget

Optimization using the Objective Form Widget is one of the key methods for integrating Human-in-the-Loop (HITL) optimization in Optuna. It optimizes machine learning models by seamlessly incorporating human feedback into the process. With customizable forms, users define objectives and gather subjective evaluations, augmenting traditional objective functions. This fusion of user preferences and expertise cultivates refined model outcomes, ultimately heightening user satisfaction.

Since the purpose of this blog is to implement HITL in Optuna using GANs, we will jump straight to defining the objective function for hyperparameter tuning.

Feel free to refer to the code and explore the architectures of the generator and discriminator by clicking on the GitHub repository link.

So, let’s start by defining our objective function:
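
The full script, including the generator and discriminator architectures, lives in the GitHub repository linked above. To keep the walkthrough readable, here is a condensed, hypothetical sketch of the train_GANs function; the numbered comments match the explanation that follows, and the actual training loop is elided:

```python
import os
import textwrap

import torch
import torch.nn as nn
from optuna.artifacts import upload_artifact
from optuna_dashboard import save_note
from optuna_dashboard.artifact import get_artifact_path
from torchvision.utils import save_image


def train_GANs(study, artifact_store, num_epochs=5):
    # 1: study.ask() starts a new trial outside of study.optimize()
    trial = study.ask()

    # 2.1: sample this trial's hyperparameters
    lr = trial.suggest_float("lr", 1e-5, 1e-3, log=True)
    latent_dim = trial.suggest_categorical("latent_dim", [64, 100, 128])

    # 2.2: build and train the GAN; a tiny placeholder generator stands in
    # for the real architectures and the MNIST training loop from the repository
    generator = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                              nn.Linear(256, 28 * 28), nn.Tanh())
    g_loss, d_loss = 0.0, 0.0  # replace with the real num_epochs training loop (uses lr)

    # 3: generate a grid of new images and push it to the artifact store
    os.makedirs("tmp", exist_ok=True)
    image_path = os.path.join("tmp", f"trial_{trial.number}.png")
    with torch.no_grad():
        grid = generator(torch.randn(64, latent_dim)).view(64, 1, 28, 28)
    save_image(grid, image_path, nrow=8)
    artifact_id = upload_artifact(trial, image_path, artifact_store)

    # 4: show the image grid and both losses in the trial's note on the dashboard
    save_note(trial, textwrap.dedent(f"""\
        ## Trial {trial.number}
        ![generated digits]({get_artifact_path(trial, artifact_id)})
        Generator loss: {g_loss:.4f} | Discriminator loss: {d_loss:.4f}
        """))

    return g_loss, d_loss
```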

Heavy code, right? Worry not, I will explain the complete code bit by bit.

  • #1 — study.ask() is an Optuna method that starts a new trial, allowing hyperparameters to be defined dynamically outside of study.optimize() (the ask-and-tell interface).
  • #2 — After the trial is created, the process configures the model, sets up the data loader, and then trains the complete GAN.
    #2.1 — trial.suggest_float samples a hyperparameter from a given range, while trial.suggest_categorical selects one from a set of specified values.
    #2.2 — With the model configured, the generator and discriminator are trained in batches, and the losses of both components are displayed after each batch within an epoch.

Note: A single trial involves training a Generative Adversarial Network (GAN) for a specified number of epochs (num_epochs)

  • #3 — Following each trial, a grid of new images is generated. These images are stored in the artifact store using upload_artifact, and their paths are retrieved using get_artifact_path.
  • #4 — Finally, the generated grid of images, along with the generator and discriminator losses, is displayed in the trial's note using save_note. This information is visible in the Optuna dashboard.

Note: The objective function returns two losses, the generator loss and the discriminator loss, since we plan to set up a multi-objective optimization using Optuna.

Now that the model training function is prepared, we’re poised to optimize the model through hyperparameter tuning.

Optimisation
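
Again, the exact code lives in the repository; the following is a condensed, hypothetical sketch of the study setup and the trial loop. The study name, widget choices, and n_batch value mirror the description below, but details such as the widget value mappings are illustrative assumptions:

```python
import time

import optuna
from optuna.trial import TrialState
from optuna_dashboard import ChoiceWidget, register_objective_form_widgets


def start_optimisation(artifact_store, n_batch=6):
    # 1: create a multi-objective study backed by a SQLite database
    study = optuna.create_study(
        study_name="HITL_with_optuna_for_digit_generation",
        directions=["minimize", "maximize"],
        storage="sqlite:///db.sqlite3",
        load_if_exists=True,
    )

    # 2: name the two human-evaluated objectives shown on the dashboard
    study.set_metric_names(["Satisfaction with generated images",
                            "Satisfaction with discriminator performance"])

    # 3: register ChoiceWidgets so a human can score each trial from the dashboard
    register_objective_form_widgets(
        study,
        widgets=[
            ChoiceWidget(choices=["Yes 👍", "Somewhat 👌", "No 👎"],
                         values=[-1, 0, 1],
                         description="Are the generated digits realistic?"),
            ChoiceWidget(choices=["Yes 👍", "Somewhat 👌", "No 👎"],
                         values=[1, 0, -1],
                         description="Is the discriminator performing well?"),
        ],
    )

    # 4: keep n_batch trials waiting for human evaluation at any given time
    while True:
        running = study.get_trials(deepcopy=False, states=(TrialState.RUNNING,))
        if len(running) >= n_batch:
            time.sleep(1)
            continue
        train_GANs(study, artifact_store)  # the function sketched earlier
```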

In the above code,

  • #1 — This initializes an Optuna study object named “HITL_with_optuna_for_digit_generation” for optimization. It configures optimization directions to minimize and maximize for the generator and discriminator losses, respectively. Additionally, it sets up storage using a SQLite database.
  • #2 — The optimization objectives (metrics) are defined using the set_metric_names method. Two objectives are specified: one for assessing satisfaction with generated images and another for evaluating satisfaction with the discriminator’s performance.
  • #3 — Custom ChoiceWidgets are registered for inputting user feedback during optimization. These widgets provide options for users to rate their satisfaction with the generator’s and discriminator’s performance. Each ChoiceWidget includes choices (like “Yes 👍”, “Somewhat 👌”, “No 👎”) and corresponding values to capture user feedback. For more widgets, feel free to click here!
  • #4 — This section initiates a loop that drives the optimization process. The number of currently running (TrialState.RUNNING) trials is periodically checked to ensure that at most six (n_batch = 6) trials are running simultaneously.

Next, we start the optimisation by defining the final function.
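
A minimal sketch of that final piece, assuming the start_optimisation function shown above, might look like this:

```python
import os

from optuna.artifacts import FileSystemArtifactStore

if __name__ == "__main__":
    # 1: create the artifact store backed by a local folder, plus a tmp
    # folder for the freshly generated image grids
    artifact_base = "./artifact"
    os.makedirs(artifact_base, exist_ok=True)
    os.makedirs("tmp", exist_ok=True)
    artifact_store = FileSystemArtifactStore(base_path=artifact_base)

    # 2: kick off the trial loop defined above
    start_optimisation(artifact_store)
```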

  • #1 — A FileSystemArtifactStore is created, which is one of the artifact store options available in Optuna. The artifact store holds the artifacts (data, files, etc.) generated during Optuna trials (here, inside the train_GANs function). A tmp folder is also created to store the grids of newly generated images.
  • #2 — The start_optimisation() function is called to run the trials of the optimization process.

Once you’ve executed the entire script, open a new terminal window and navigate to the root folder where the script is located. Then, run the Optuna dashboard locally by executing the following command:

$ optuna-dashboard sqlite:///db.sqlite3 --artifact-dir ./artifact --port 5004

Now that you’ve executed the shell command, get ready to watch the HITL optimization process unfold on the Optuna dashboard.

HITL optimization in GANs

The above video showcases the Optuna dashboard in action as it navigates through the process of HITL (Human-in-the-Loop) optimization. It demonstrates the seamless integration of human feedback into the optimization loop, emphasizing the collaborative nature of HITL optimization. There are, however, a few important points to note from this demonstration that shed light on the effectiveness and usability of Optuna in HITL optimization scenarios.

  • The note space within each trial displays the grid of new images generated by the generator for easy and convenient evaluation of the model.
  • Optuna dashboard empowers the user to choose the appropriate option from the widget for each metric based on their subjective evaluation of the images. This assigns values to these predefined metrics for each trial.
  • The maximum number of concurrent trials is 6, defined by n_batch = 6. A new trial is created when one of the existing trials concludes.

Thank you for reading this blog post on HITL optimization with Optuna. I hope you found it informative and valuable for your optimization journey. If you have any questions or comments, please feel free to leave them below.

Stay tuned for the next part of this blog, where we’ll delve into another type of HITL optimization offered by Optuna: Preferential Optimization.

Until then, happy optimizing!

If you’d like to revisit the code, feel free to click through to the GitHub repo.
