Codeless Deep Learning Pipelines with Ludwig and Comet.ml

How to use Ludwig and Comet.ml together to build powerful deep learning models right in your command line — using an example text classification model

Published in Comet · 5 min read · Jul 24, 2019

Ludwig is a TensorFlow-based toolbox that allows users to train and test deep learning models without the need to write code.

By offering a well-defined, codeless deep learning pipeline from beginning to end, Ludwig enables practitioners and researchers alike to quickly train and test their models and obtain strong baselines to compare experiments against.

“Ludwig helps us build state of the art models without writing code, and by integrating Ludwig with Comet, we can track all of our experiments in a reproducible way, gain visibility, and a better understanding of the research process.” — Piero Molino, Senior ML / NLP Research Scientist at Uber AI Labs and Creator of Ludwig

Ludwig offers CLI commands for preprocessing data, training, issuing predictions, and visualizations. In this post, we’ll show you how to use Ludwig and track your Ludwig experiments with Comet.ml.

See the Ludwig GitHub repo here

Want an image captioning model or visual question answering model quickly? Use Ludwig in these 4 easy steps to build, train, and evaluate deep learning models.

Here at Comet.ml, we were excited by the potential for Ludwig to fill a void in the machine learning ecosystem. Ludwig finally takes the idea of abstract representations of machine learning models, training, data, and visualizations and turns them into a seamless, executable pipeline from start to finish.

This means that we can finally spend less time:

dealing with data preprocessing for different data types ☠️
meshing different model architectures just to get simple baseline models
writing code to make predictions

and more time:

getting transparent results 🚀

Integrating Comet with Ludwig

We worked with the Ludwig team to integrate Comet.ml so that users can track Ludwig-based experiments live as they are training.

There are three main areas where Comet.ml complements Ludwig:

Comparing multiple Ludwig experiments: Ludwig makes it easy for you to train and iterate through different models and parameter sets. Comet provides an interface to help you keep track of the results and details across those different experiments.
Organized store for your analysis: Ludwig allows you to generate visualizations of the training process and results. Comet keeps track of those visualizations and automatically associates them with your experiments, so you don't have to save them somewhere yourself.
Meta-analysis for your experiments: You’ll probably do multiple iterations of your Ludwig experiments. Tracking them with Comet enables you to analyze things like which parameters work, in order to build a better model.

By running your Ludwig experiment with Comet.ml, you can capture your experiment’s:

code (the command line arguments you used)
live performance charts so you can see the model metrics in real-time (as opposed to waiting until training is done)
visualizations you created with Ludwig
environment details (e.g., package versions)
history of runs (HTML tab)

…and more!

Running Ludwig with Comet

1. Install Ludwig for Python, along with spaCy's English model, which is needed because this example uses text features. This example has been tested with Python 3.6.

$ pip install ludwig
$ python -m spacy download en

If you encounter problems installing gmpy, install libgmp or gmp. On Debian-based Linux distributions: sudo apt-get install libgmp3-dev. On macOS: brew install gmp.

2. Install Comet:

$ pip install comet_ml

3. Set up your Comet credentials:

Get your API key at https://www.comet.ml
Make that API key available to Ludwig and set which Comet Project you’d like the Ludwig experiment details to report to:

$ export COMET_API_KEY="..."
$ export COMET_PROJECT_NAME="..."
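Since the Comet integration reads these two variables, it can help to confirm they are set before launching a run. The following is a minimal sketch of our own (the function name check_comet_env is hypothetical, not part of Ludwig or Comet) that only checks the variables are non-empty and never prints their values:

```shell
# Return 0 only if both variables used in this post are set and non-empty.
# check_comet_env is a hypothetical helper, not a Ludwig or Comet command.
check_comet_env() {
    missing=0
    for name in COMET_API_KEY COMET_PROJECT_NAME; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

You could then guard a training run with it, for example by writing check_comet_env && ludwig experiment --comet followed by your usual arguments.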

4. We recommend that you create a new directory for each Ludwig experiment.

$ mkdir experiment1
$ cd experiment1

Some background: every time you want to create a new model and train it, you will use one of two commands: train or experiment.
When you run either command with the --comet flag, a .comet.config file is created. This .comet.config file pulls your API key and Comet Project name from the environment variables you set above.

If you want to run another experiment, it is recommended that you create a new directory.
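If you run many experiments, creating a fresh directory each time can be scripted. Here is a small sketch under our own naming assumption (directories called experiment1, experiment2, and so on, following the example above); the helper new_experiment_dir is hypothetical:

```shell
# Create the first unused directory named experimentN (N = 1, 2, ...)
# under the current directory and print its name.
# new_experiment_dir is a hypothetical helper for illustration only.
new_experiment_dir() {
    n=1
    while [ -e "experiment$n" ]; do
        n=$((n + 1))
    done
    mkdir "experiment$n" && echo "experiment$n"
}
```

A typical use would be cd "$(new_experiment_dir)" before starting the next run.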

5. Download the dataset. For this example, we will work on a text classification use case with Reuters-21578, a well-known newswire dataset. It contains only 21,578 newswire documents, grouped into 6 categories. Two are ‘big’ categories (many positive documents), two are ‘medium’ categories, and two are ‘small’ categories (few positive documents).

Small categories: heat.csv, housing.csv
Medium categories: coffee.csv, gold.csv
Big categories: acq.csv, earn.csv

$ curl http://boston.lti.cs.cmu.edu/classes/95-865-K/HW/HW2/reuters-allcats-6.zip -o reuters-allcats-6.zip
$ unzip reuters-allcats-6.zip

6. Define the model we wish to build with the input and output features we want. Create a file named model_definition.yaml with these contents:

input_features:
    -
        name: text
        type: text
        level: word
        encoder: parallel_cnn

output_features:
    -
        name: class
        type: category

7. Train the model with the new --comet flag:

$ ludwig experiment --comet --data_csv reuters-allcats.csv \
    --model_definition_file model_definition.yaml

Once you run this, a Comet experiment will be created. Check your output for the Comet experiment URL and click it.

8. In Comet, you’ll be able to see:

your live model metrics in real-time on the Charts tab
the bash command you ran to train your experiment along with any run arguments in the Code tab
hyperparameters that Ludwig is using (defaults) in the Hyper parameter tab

and much more! See this sample experiment here

Running our example text classification Ludwig experiment with the --comet flag. You can interact with this Comet experiment here
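Because Comet is most useful when you compare several runs, one natural next step is to train the same model with different text encoders, each in its own directory. The following is only a sketch: the helper names are hypothetical, and the encoder names you pass in are assumed to be valid Ludwig text encoders, so check them against the Ludwig documentation for your version.

```shell
# Print a model definition like the one in step 6, with the text
# encoder replaced by the first argument.
model_definition_for() {
    cat <<EOF
input_features:
    -
        name: text
        type: text
        level: word
        encoder: $1

output_features:
    -
        name: class
        type: category
EOF
}

# For each encoder name given as an argument, create a directory
# experiment_<encoder> containing its own model_definition.yaml.
write_definitions() {
    for encoder in "$@"; do
        mkdir -p "experiment_$encoder"
        model_definition_for "$encoder" \
            > "experiment_$encoder/model_definition.yaml"
    done
}
```

For example, write_definitions parallel_cnn stacked_cnn would create two directories. You would then cd into each one and run the command from step 7, pointing --data_csv at wherever the dataset lives (for instance ../reuters-allcats.csv if it sits one level up).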

If you choose to make any visualizations with Ludwig, it’s also possible to upload these visualizations to Comet’s Image Tab by running:

$ ludwig visualize --comet \
    --visualization learning_curves \
    --training_statistics \
    ./results/experiment_run_0/training_statistics.json
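The path above points at the first run in the results directory. If you have trained several times in the same directory, you may prefer to pick up the newest statistics file automatically. This sketch assumes each run writes a training_statistics.json into its own subdirectory of ./results, as in the path above; latest_training_statistics is a hypothetical helper:

```shell
# Print the most recently modified training_statistics.json found one
# level below the results directory $1 (default: ./results).
# Returns non-zero if no such file exists.
latest_training_statistics() {
    latest=""
    for f in "${1:-./results}"/*/training_statistics.json; do
        [ -f "$f" ] || continue
        if [ -z "$latest" ] || [ "$f" -nt "$latest" ]; then
            latest=$f
        fi
    done
    [ -n "$latest" ] && echo "$latest"
}
```

You could then pass "$(latest_training_statistics)" as the value of --training_statistics in the ludwig visualize command shown above.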