Azure AutoML for Images: Automode
Train high-quality models with ease using automatic hyperparameter tuning
Introduction
When training complex deep learning models, there is an extensive list of hyperparameters that must be set and tuned to obtain optimal performance, including the learning rate, image size, batch size, and number of epochs. The sheer number of hyperparameters makes it difficult for users to decide which ones to focus on. Moreover, some of these parameters are model-specific and do not contribute equally to the model's performance, which makes the task even more challenging. Consequently, tuning these hyperparameters to obtain the optimal model requires a lot of time, effort, and resources.
In this blog post, we cover how AutoML for Images can automatically tune hyperparameters and produce an optimal model given a budget specified as a maximum number of trials. For more information about AutoML for Images, check out the General Availability announcement, the official documentation, and the GitHub notebooks.
Hyperparameter Tuning with AutoML for Images
Hyperparameter tuning consists of finding an optimal set of hyperparameter values from the space of possible hyperparameters. On AzureML, it works by running multiple trials within a training job, one per hyperparameter configuration. Here is an example of how to manually sweep over some of the hyperparameters using AutoML for Images for two different models:
from azure.ai.ml import MLClient, automl
from azure.ai.ml.automl import SearchSpace
from azure.ai.ml.sweep import Choice, Uniform

image_object_detection_job = automl.image_object_detection(
    compute="gpu-cluster",
    experiment_name="image-object-detection-sweep",
    training_data=training_data,
    validation_data=validation_data,
)

# Configure search space for hyperparameter tuning
image_object_detection_job.extend_search_space(
    [
        SearchSpace(
            model_name=Choice(["yolov5"]),
            learning_rate=Uniform(0.0001, 0.01),
            model_size=Choice(["small", "medium"]),  # model-specific
        ),
        SearchSpace(
            model_name=Choice(["fasterrcnn_resnet50_fpn"]),
            learning_rate=Uniform(0.0001, 0.001),
            optimizer=Choice(["sgd", "adam", "adamw"]),
            min_size=Choice([600, 800]),  # model-specific
        ),
    ]
)

# Configure number of trials
image_object_detection_job.set_limits(
    max_trials=10,
    max_concurrent_trials=5,
)

# Submit job to AzureML workspace
client.jobs.create_or_update(image_object_detection_job)
Users need to provide a search space, a sampling algorithm, and the limits of the job, i.e., how many trials (sampled configurations) will be run.
With this approach, the user has more control and flexibility, but it is also the user's responsibility to define the search space and choose which hyperparameters to tune. This requires a deep understanding of the hyperparameters, expertise in the area, and insight into which parameters matter most. Often there is no strong baseline to compare against, and searching for an optimal set of hyperparameters ends up being the most time-consuming and costly part of training an optimal model.
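Conceptually, a random sweep like the one above samples configurations from the declared space, runs one training trial per configuration, and keeps the best. Here is a toy, framework-free sketch of that loop; the search space mirrors the structure of the example above, and train_and_score is a stand-in for a real training trial, not part of the AzureML SDK:

```python
import random

# Toy search space: lists are discrete choices, tuples are uniform ranges.
search_space = {
    "model_name": ["yolov5"],
    "learning_rate": (0.0001, 0.01),   # sampled uniformly
    "model_size": ["small", "medium"],
}

def sample_config(space, rng):
    """Draw one hyperparameter configuration from the space."""
    config = {}
    for name, spec in space.items():
        if isinstance(spec, tuple):   # continuous range -> like Uniform
            low, high = spec
            config[name] = rng.uniform(low, high)
        else:                         # discrete list -> like Choice
            config[name] = rng.choice(spec)
    return config

def train_and_score(config):
    """Stand-in for a real training trial; returns a fake validation metric."""
    # Pretend mid-range learning rates score best.
    return 1.0 - abs(config["learning_rate"] - 0.005)

def random_sweep(space, max_trials, seed=0):
    """Sample max_trials configurations and return the best-scoring one."""
    rng = random.Random(seed)
    trials = [sample_config(space, rng) for _ in range(max_trials)]
    return max(trials, key=train_and_score)

best = random_sweep(search_space, max_trials=10)
```

In the real service, each trial is a full training run on the compute target and the metric is the model's validation score; the sketch only illustrates the sample-train-select loop.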
Introducing Automode
Automode automatically selects model hyperparameters (batch size, learning rate, number of epochs, etc.) given a budget in the form of a maximum number of trials. This feature allows less experienced users to produce good-quality models without performing any manual hyperparameter tuning, and more experienced data scientists to save time and effort on building strong baselines for experimentation.
Here is an example of how you can run Automode using AutoML for Images:
image_object_detection_job = automl.image_object_detection(
    compute="gpu-cluster",
    experiment_name="image-object-detection-automode",
    training_data=training_data,
    validation_data=validation_data,
)

# Configure number of trials
image_object_detection_job.set_limits(
    max_trials=10,
    max_concurrent_trials=5,
)

# Submit job to AzureML workspace
client.jobs.create_or_update(image_object_detection_job)
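Both snippets above assume that training_data and validation_data are already defined. In the AzureML Python SDK v2 these are typically MLTable inputs; a minimal sketch, assuming the MLTable folders live under ./data (the paths below are placeholders for your own data assets):

```python
from azure.ai.ml import Input
from azure.ai.ml.constants import AssetTypes

# Placeholder paths; point them at folders containing an MLTable file.
training_data = Input(type=AssetTypes.MLTABLE, path="./data/training-mltable-folder")
validation_data = Input(type=AssetTypes.MLTABLE, path="./data/validation-mltable-folder")
```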
Advantages of Automode
- Users can come up with strong baselines with little or no effort.
- Users do not need to provide any details about the space to search over. The underlying algorithm will automatically determine the region of the hyperparameter space to sweep over.
- Automode adapts to the hardware (AzureML compute instance or cluster) chosen by the user. It determines the batch size, resolution, etc. based on the memory available on the compute, so that the training job does not run out of memory.
Experiments
We performed extensive experiments; for brevity, we present results only on the datasets below for classification and object detection. Automode was also compared with two competitor platforms. We chose datasets with varying characteristics in terms of number of classes, object size, image resolution, etc. All reported results are on the test dataset.
Object Detection
For this task we selected the following datasets:
- VisDrone 2019: Dataset containing images and videos captured by drones.
- KITTI: Street view images dataset used in autonomous driving.
- VOC 2012: Dataset containing a variety of object types in realistic scenes.
- Safety Helmet: Consists of images of personal safety equipment.
From figures 1–4 above, we can see that as the budget increases, Automode improves the model's mAP, although the gains eventually plateau. Increasing the budget from $7 to $141 improved mAP by 13% for VisDrone 2019, and increasing it from $5 to $52 yielded a 4% improvement on the VOC dataset. This is achieved without any additional time or effort from data scientists or engineers.
For comparisons with the competitors, the budget was set to the one that achieved the best metric for each competitor.
Given the same budget, Automode significantly outperforms competitor 1. Against competitor 2, Automode is behind by ~1% on Safety Helmet and by 0.1% on the VOC dataset, but outperforms it on the VisDrone and KITTI datasets.
Image Classification
For this task we selected the following datasets:
- MIT Indoors: Dataset for indoor scene recognition.
- Deep Fashion: Large scale clothing dataset for e-commerce.
- Deep Weeds: Dataset containing images of weed species.
As with object detection, we see that model accuracy increased by ~2% when the budget was raised from $3 to $62 for Deep Fashion, and from $1 to $36 for MIT Indoors.
When compared with the competitors, Automode significantly outperforms competitor 1 given the same budget; against competitor 2, Automode is behind by 0.2% on Deep Weeds but outperforms it on the MIT Indoors and Deep Fashion datasets.
Conclusion
We found that Automode in AutoML for Images greatly simplifies the lives of data scientists and ML engineers by delivering high-quality models while eliminating the need for manual hyperparameter tuning. Please give it a try and let us know what you think!
Sample Notebooks
Image Object Detection
Image Multiclass Classification
Image Multilabel Classification
Image Instance Segmentation
CLI Examples
Image Object Detection
Image Multiclass Classification
Image Multilabel Classification
Image Instance Segmentation
Resources
- AzureML: Set up AutoML for computer vision
- Tutorial: AutoML- train object detection model
- AzureML: Train small objects in images
- AzureML: Prepare data for computer vision tasks
- Azure AutoML for Images: Baseline and beyond for Computer Vision models
Special thanks to Mercy Ranjit, Rupal Jain, and YiYou Lin!