Aim 3.0.0: The foundations for an open-source, open-metadata ML platform

Gev Sogomonian
Oct 22, 2021

The origins of Aim

In the fateful summer of 2020, our friend mahnerak was hitting the limits of TensorBoard and wasn’t willing to send his training logs to a third-party cloud. He’s a researcher at a non-profit lab who just wanted to focus on his actual research, yet he would spend hours on TensorBoard.

Gor and I started hacking on an open-source library to store metrics and hyperparameters. In a month, mahnerak was using Aim 1.0 to track, store, search and group his metrics.

By fall 2020, Aim 2.0 launched as a free, open-source and self-hosted alternative to Weights and Biases, TensorBoard and MLflow. To our surprise, even r/MachineLearning loved it.

By spring 2021, mahnerak had co-authored the paper WARP: Word-level Adversarial ReProgramming (code), published at ACL. By that point, Aim users had already contributed over 100 feature requests.

A scale problem

But Aim’s power users — who often do 5K+ runs — were hitting issues.

After over 250 pull requests, 1.2K GitHub stars and 200 feature requests — live updates, image tracking, distribution tracking… — Aim 2.0 was hitting the limits of Aim 1.0’s design.

To support the future, we had to make changes to the foundation now.

Launching Aim 3.0.0

An additional 317 pull requests later, we are excited to launch Aim v3.0.0 !!!

Here are the most important changes:

  • Home page and run detail page
  • Runs, metrics and params explorers
  • Bookmarks and Tags

  • New and much more intuitive (but still quite vanilla) API to track your training runs
  • New and 10x faster embedded storage based on RocksDB. It will allow us to store virtually any type of AI metadata (as opposed to AimRecords, which was designed specifically for metrics and hyperparams)
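
To give a feel for the new tracking API mentioned above, here is a minimal sketch rather than an official example. It assumes the `Run` class from the `aim` package, with dictionary-style parameter assignment and a `track` method. The hyperparameters and loss values are made up, and `InMemoryRun` and `log_training` are hypothetical helpers (not part of Aim) that let the snippet run even when Aim is not installed.

```python
# Minimal sketch of tracking a training run with the Aim 3 Python SDK.
# Assumption: `pip install aim` provides `aim.Run`; all values are made up.


class InMemoryRun:
    """Hypothetical stand-in (NOT part of Aim), used only if Aim is missing."""

    def __init__(self):
        self.params = {}
        self.records = []

    def __setitem__(self, key, value):
        # Mirrors the dictionary-style parameter logging: run["hparams"] = {...}
        self.params[key] = value

    def track(self, value, name, step=None, context=None):
        # Mirrors metric logging: one record per tracked value
        self.records.append((name, step, value, context))


def log_training(run, losses, hparams):
    """Hypothetical helper: log hyperparameters once, then one loss per step."""
    run["hparams"] = hparams
    for step, loss in enumerate(losses):
        run.track(loss, name="loss", step=step, context={"subset": "train"})
    return run


if __name__ == "__main__":
    try:
        from aim import Run  # real Aim run object, if the package is installed
        run = Run()
    except ImportError:
        run = InMemoryRun()  # fall back to the stand-in defined above

    log_training(run, [0.9, 0.5, 0.3], {"learning_rate": 0.001, "batch_size": 32})
```

With Aim installed, the logged run should then show up in the UI started with `aim up`.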

Enjoy the changes!

Performance improvements

  • Average run query execution time on ~2000 runs: 0.784s.
  • Average metrics query execution time on ~2000 runs with 6000 metrics: 1.552s.
  • The new UI runs smoothly with ~500 metrics displayed at the same time, with full Aim table interactions (for comparison, v2 stayed performant only up to about 100 metrics).

Aim Roadmap

With this version we are also publishing the Aim roadmap for the next 3 months.

This is a living document and we hope that the community will help us shape it towards supporting the most important use-cases.

We are also inviting community contributors to help us get there faster!

Why are we building Aim?

We started working on Aim with a strong belief that open source is in the DNA of AI software (2.0) development.

Existing open-source tools (TensorBoard, MLflow) are super-inspiring for us.

However, we see lots of room for improvement, especially around issues like:

  • ability to handle 1000s of large-scale experiments
  • actionable, beautiful and performant visualizations
  • extensibility: how easy are the APIs for extension/democratization?

We are inspired to build beautiful and performant AI dev tools with great APIs.

Our mission…

Aim’s mission is to democratize AI dev tools. We believe that the best AI tools need to be:

  • open-source, open-data-format and community-driven
  • equipped with great UI/UX, a CLI and other interfaces for automation
  • performant in both the UI and data handling
  • extensible, so that users can build around them for many use-cases

Thanks to

Ruben Karapetyan for being the first to believe in this project, spending lots of his time on it and setting the foundations for the beautiful UI.

Mahnerak for sharing his problems, continuously testing, and coming up with better solutions for UX and features. Also for helping us build the next-gen storage for Aim.

Aim users Mohammad Elgaar and Vopani for continuous feedback on our work.

The contributors who have been relentlessly iterating over the course of the summer.

On to the next generation of ML tools!!

Join Us!

Join the Aim community, test Aim out, ask questions, help us build the future of AI tooling!

If you find Aim useful, drop by and star the repo ⭐

AimStack

Aim logs your training runs, enables a beautiful UI to compare them and an API to query them program