Open Sourcing Atlas

Dessa · Published in Dessa News · 3 min read · Mar 18, 2020

Self-Hosted Tools For Applied Deep Learning Development

Just a month after joining Square, we’re excited to announce the open source launch of Atlas, our platform for developing applied deep learning projects.

In Atlas’ GUI, you can view artifacts like images and sound files easily for all the experiments you’re running.

What is Atlas?

Atlas helps machine learning practitioners take projects from 0 to 100 fast, with features that make it easy to run, evaluate and deploy thousands of experiments concurrently. We first built Atlas for ourselves, when few tools existed for tackling some of the bigger challenges specific to deep learning:

  • Successful deep learning projects require large-scale experimentation. Without the right tools, it is time-consuming and difficult to accurately track experiments.
  • The lack of tools for experiment management meant that a lot of this work was done ad hoc, making it hard to collaborate and reproduce model results.
  • Deep learning requires a lot of compute, and we found ourselves spending much of our time rigging up the right infrastructure: valuable time that could have been spent experimenting!

These challenges prevented us from getting the full impact we knew our deep learning projects could have, and we were sure that other practitioners were facing the same kinds of roadblocks.

Last July, we released the first version of Atlas to the public, and it has since been downloaded by thousands of machine learning practitioners. By open-sourcing the code, we’re excited to invite users to help make Atlas a tool that shortens the time to results with deep learning even further.

Features

Atlas is part of Foundations, our end-to-end platform for applied deep learning development and production. It’s a flexible machine learning tool that consists of a Python SDK, CLI, GUI & Scheduler.

  • Job scheduling: Atlas’ job scheduler handles experiment management for you, running thousands of experiments concurrently and asynchronously.
  • Experiment management & tracking: Tag experiments and easily track hyperparameters, metrics, and artifacts such as images, GIFs, and audio clips in a web-based dashboard.
  • Reproducibility: Every job run is recorded and tracked using a unique job ID, making it easy to reproduce and share any experiment.
  • Python SDK: An easy-to-use SDK makes Atlas compatible with any machine learning library or framework. The SDK also lets you run hyperparameter optimization programmatically.
  • Built-in TensorBoard integration: Directly compare multiple job runs on TensorBoard within the Atlas GUI.
  • Keycloak integration: Set up your authentication server easily and manage access to your DL cluster with Atlas’ built-in Keycloak integration.
  • Self-hosted: Run Atlas on a single node (e.g. your laptop) or on a multi-node cluster (e.g. on-premise servers or cloud clusters on AWS, GCP, etc.).
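
To make the tracking and reproducibility ideas above concrete, here is a minimal, self-contained sketch of what recording experiments under unique job IDs can look like. It is an illustration only and does not use the Atlas SDK: `JobRecord`, `ExperimentLog`, `toy_train`, and `grid_search` are hypothetical names invented for this example, and the "training" function is a stand-in that returns a made-up loss.

```python
import itertools
import uuid
from dataclasses import dataclass, field


@dataclass
class JobRecord:
    """One experiment run: its hyperparameters, metrics, and tags."""
    params: dict
    job_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    metrics: dict = field(default_factory=dict)
    tags: set = field(default_factory=set)


class ExperimentLog:
    """In-memory stand-in for an experiment tracker (hypothetical, not the Atlas API)."""

    def __init__(self):
        self._jobs = {}

    def run(self, train_fn, params, tags=()):
        # Record the exact parameters before running, so the job can be replayed.
        record = JobRecord(params=dict(params), tags=set(tags))
        record.metrics = train_fn(**params)
        self._jobs[record.job_id] = record
        return record.job_id

    def get(self, job_id):
        return self._jobs[job_id]

    def best(self, metric):
        # Lower is better for this sketch (e.g. a loss value).
        return min(self._jobs.values(), key=lambda r: r.metrics[metric])


def toy_train(learning_rate, batch_size):
    """Placeholder for a real training loop; returns a made-up loss."""
    loss = (learning_rate - 0.01) ** 2 + 1.0 / batch_size
    return {"loss": loss}


def grid_search(log):
    """Run every combination in a small hyperparameter grid and return the job IDs."""
    grid = {"learning_rate": [0.001, 0.01, 0.1], "batch_size": [16, 64]}
    keys = list(grid)
    job_ids = []
    for values in itertools.product(*(grid[k] for k in keys)):
        job_ids.append(log.run(toy_train, dict(zip(keys, values)), tags=["grid"]))
    return job_ids


if __name__ == "__main__":
    log = ExperimentLog()
    grid_search(log)
    winner = log.best("loss")
    print(winner.job_id, winner.params, winner.metrics)
```

Because each record stores the exact parameters alongside its ID, sharing a job ID is enough for a teammate to look up the run and replay it with the same settings. A real tracker would also persist artifacts and run jobs on a scheduler, which this sketch leaves out.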

Get started

Find the Atlas repository on GitHub and read the docs to install it on your local machine. We’ve also included an image segmentation tutorial to make it easy to get started.

Contribute

Now that Atlas is open source, we’re excited to welcome code contributions from the community! If you’re interested in creating a pull request, follow the contribution guidelines on GitHub.

Get in touch

Have a question about Atlas or want to learn more about how to contribute? We would love to chat! Join our community Slack and get in touch with our machine learning engineering team directly.

A (not-so) secret AI lab at Square. Learn more at dessa.com