Aim basics: using context and subplots to compare validation and test metrics

Gev Sogomonian
Feb 16 · 2 min read

Researchers divide datasets into three subsets (train, validation and test) so they can evaluate model performance at different stages.

The model is trained on the train subset, and metrics such as loss and accuracy are collected along the way to evaluate how well the training is going.

The validation and test sets are used to evaluate the model on additional unseen data to verify how well it generalises.

Models are usually run on the validation subset after each epoch.

Once the training is done, models are tested on the test subset to verify the final performance and generalisation.
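The split and the evaluation order described above can be sketched in plain Python. This is only an illustration: the helper names split_dataset and evaluation_schedule, the 80/10/10 proportions and the toy dataset are assumptions made for this example, not something prescribed by Aim.

```python
import random


def split_dataset(items, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle a copy of `items` and cut it into train / val / test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


def evaluation_schedule(num_epochs):
    """List (epoch, subset) pairs: train and val every epoch, test once at the end."""
    schedule = []
    for epoch in range(num_epochs):
        schedule.append((epoch, "train"))
        schedule.append((epoch, "val"))
    schedule.append((num_epochs - 1, "test"))
    return schedule


# Toy dataset of 100 examples (hypothetical).
train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
print(evaluation_schedule(2))
```

The test subset shows up only once, after the last epoch, which is why its metrics end up as a single point rather than a curve.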

There is a need to collect and effectively compare all these metrics.

Here is how to do that with Aim.

Using context to track metrics for different subsets

Use the aim.track context arguments to pass additional information about the metrics. All context parameters can be used to query, group and do other operations on top of the metrics.

Here is how it looks in the code.

Once the training has run, execute aim up in your terminal to start the Aim UI.
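As a rough sketch of the idea, the snippet below tags every tracked value with a subset context. To keep it runnable without Aim installed, it defines a local stand-in named track; the assumption is that the real aim.track call similarly takes a value, a metric name, an epoch and extra keyword arguments that become the context, so check the Aim documentation for the exact signature. The fake_evaluate helper and its numbers are made up for illustration.

```python
from collections import defaultdict

# Stand-in for the tracking call so this sketch runs on its own.
# ASSUMPTION: value, metric name, epoch, then free-form keyword
# arguments that are stored as the metric's context.
records = defaultdict(list)


def track(value, name, epoch, **context):
    key = (name, tuple(sorted(context.items())))
    records[key].append((epoch, value))


def fake_evaluate(subset, epoch):
    """Hypothetical placeholder returning made-up loss and bleu values."""
    offset = {"train": 0.0, "val": 0.1, "test": 0.15}[subset]
    loss = 1.0 / (epoch + 1) + offset
    bleu = 10.0 * (epoch + 1) - 10.0 * offset
    return loss, bleu


NUM_EPOCHS = 3
for epoch in range(NUM_EPOCHS):
    # Train and validation metrics are tracked after every epoch.
    for subset in ("train", "val"):
        loss, bleu = fake_evaluate(subset, epoch)
        track(loss, name="loss", epoch=epoch, subset=subset)
        track(bleu, name="bleu", epoch=epoch, subset=subset)

# Test metrics are tracked once, after training is done.
loss, bleu = fake_evaluate("test", NUM_EPOCHS - 1)
track(loss, name="loss", epoch=NUM_EPOCHS - 1, subset="test")
track(bleu, name="bleu", epoch=NUM_EPOCHS - 1, subset="test")

for (name, context), points in sorted(records.items()):
    print(name, dict(context), len(points), "points")
```

Because the metric name stays the same and only the context changes, loss for train, val and test remain one metric that can later be filtered or grouped by subset.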

Using subplots to compare test, val loss and bleu metrics

Note: The bleu metric is used here instead of accuracy as we are looking at Neural Machine Translation experiments. But this works with every other metric too.

Let’s go step by step through how to break down lots of experiments using subplots.

Step 1. Explore the runs and the context table, and play with the query language.

Explore the training runs

Step 2. Add the bleu metric to the Select input — query both metrics at the same time. Divide into subplots by metric.

Divide into subplots by metric

Step 3. Search by context.subset to show the loss and bleu metrics for both test and val. Then divide into subplots by context.subset as well, so that Aim UI shows test and val metrics on separate subplots for easier comparison.

Divide into subplots by context / subset
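Conceptually, steps 2 and 3 amount to a filter followed by a group-by. The sketch below mimics that in plain Python; it is not Aim's query language or UI code, and the run names, values and the divide_into_subplots helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical tracked series: (run, metric, subset, values). All numbers are made up.
raw = [
    ("run_a", "loss", "train", [1.2, 0.8, 0.6]),
    ("run_a", "loss", "val", [1.3, 0.9, 0.8]),
    ("run_a", "loss", "test", [0.85]),
    ("run_a", "bleu", "val", [12.0, 18.0, 21.0]),
    ("run_a", "bleu", "test", [20.5]),
    ("run_b", "loss", "train", [1.1, 0.7, 0.5]),
    ("run_b", "loss", "val", [1.2, 0.8, 0.7]),
    ("run_b", "loss", "test", [0.75]),
    ("run_b", "bleu", "val", [13.0, 19.0, 23.0]),
    ("run_b", "bleu", "test", [22.0]),
]
series = [dict(zip(("run", "metric", "subset", "values"), row)) for row in raw]


def divide_into_subplots(series, subsets, group_by):
    """Keep series whose subset is in `subsets`, then group them by the `group_by` fields."""
    subplots = defaultdict(list)
    for s in series:
        if s["subset"] in subsets:
            key = tuple(s[field] for field in group_by)
            subplots[key].append(s)
    return dict(subplots)


# Search by subset (test and val), then divide by metric and by subset.
subplots = divide_into_subplots(
    series, subsets={"test", "val"}, group_by=("metric", "subset")
)
for key in sorted(subplots):
    print(key, [s["run"] for s in subplots[key]])

# One possible way to pick a "best" run: lowest final validation loss.
best = min(subplots[("loss", "val")], key=lambda s: s["values"][-1])
print("best run by final val loss:", best["run"])
```

Each key in subplots corresponds to one chart: loss on val, loss on test, bleu on val and bleu on test, with every run drawn in each.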

Now it’s easy and straightforward to compare all 4 metrics simultaneously and find the best version of the model.

Summary

Here is a full summary video on how to do it in the UI.

Learn More

If you find Aim useful, support us and star the project on GitHub. Join the Aim community and share more about your use-cases and how we can improve Aim to suit them.

