Breaking the Magicians’ code with IBM Watson Studio’s AutoAI Notebooks

Yair Schiff
IBM Data Science in Practice
5 min read · Apr 15, 2021

It’s one of the most famous codes of conduct: “Magicians never reveal their secrets.” But there are rare moments when an audience is so wowed, so struck by the ‘what just happened’ feeling that it would be almost cruel not to let them in on the secret.

At IBM Watson® Machine Learning, we couldn’t help but show you how we pulled the rabbit pipeline out of the magical data hat, and we are proud to announce the General Availability of the AutoAI Notebooks feature.

With this feature, we are inviting AutoAI users, mystified by its machine learning magic, to peek behind the award-winning user interface and interact directly with the APIs and code that power the AutoAI engine. The AutoAI Notebooks consist of automatically generated Python code and API calls that offer users a guided tour of all the automated steps AutoAI executes in creating state-of-the-art machine learning pipelines. Exposing the APIs and steps gives users the ability to programmatically interact with and customize their AutoAI pipelines and experiments.

For readers new to Watson Machine Learning, AutoAI is an automated machine learning tool that is fully integrated within Watson Studio on Cloud Pak for Data™. AutoAI does in minutes what would typically take whole teams of data scientists hours or days: data preparation, model development, feature engineering, and hyperparameter optimization. Take a moment to learn more about IBM’s AutoAI at work: two real-world applications.

Visit Watson Studio today to get started and experience this product for yourself!

Hop on the Magic School Bus

At IBM Watson Studio, we know that if data scientists were Kareem Abdul-Jabbar, then Notebooks would be their skyhook. So if you’ll indulge the ’90s kid in us for a moment, we’d love to do our best Ms. Frizzle impression and give you a tour of the AutoAI Notebooks, now generally available, highlighting some of their key features and customization options.

Let’s start with some sample data. In this experiment we’ll be using a dataset with various automobile information to predict insurance riskiness categories. Read more about the dataset and download instructions.

Our first step is to add this data to a Project and run an AutoAI experiment:

Animation of a user setting up an AutoAI multi-class experiment.
Animation of an AutoAI experiment ingesting data and creating pipelines.
Setting up (left) and running (right) the AutoAI multi-class experiment

Pipeline Notebooks

Here’s the part where you say, “but wait, how did they do that?” Well, let us show you. The first type of Notebook surfaced to users can be found in the pipeline leaderboard:

Animation of a user saving the leading pipeline on a leaderboard as a ‘pipeline’ Notebook.
Saving a pipeline notebook for the top performing pipeline

For each pipeline generated by AutoAI, we provide users with a step-by-step guide to the code that ingests data, splits training and test sets, and builds, trains, and scores the pipeline. This lets the curious data scientist see what made their winning pipeline (or any other generated pipeline, for that matter) work. With access to the source code, users can also customize the pipelines to their needs.

Animation of a user scrolling through the various sections of the pipeline Notebook, including data ingestion, pipeline composition, pipeline fitting, and scoring.
Using the pipeline Notebook to see the components of a pipeline, re-train it, and test it
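
To make the split, train, and score steps concrete, here is a minimal sketch in plain Python. It is not the code AutoAI generates: the nearest-centroid classifier, the helper names, and the twelve toy rows (engine size, horsepower, risk label) are all assumptions invented for illustration, chosen so the example runs with no dependencies.

```python
import random
from collections import defaultdict


def train_test_split(rows, labels, test_fraction=0.25, seed=42):
    """Shuffle row indices reproducibly and hold out a fraction for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(len(rows) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(idx, seq):
        return [seq[i] for i in idx]

    return (pick(train_idx, rows), pick(test_idx, rows),
            pick(train_idx, labels), pick(test_idx, labels))


class NearestCentroid:
    """Toy stand-in for a pipeline: predicts the class with the closest mean."""

    def fit(self, rows, labels):
        sums, counts = {}, defaultdict(int)
        for row, label in zip(rows, labels):
            previous = sums.get(label, [0.0] * len(row))
            sums[label] = [s + v for s, v in zip(previous, row)]
            counts[label] += 1
        self.centroids_ = {
            label: [s / counts[label] for s in total]
            for label, total in sums.items()
        }
        return self

    def predict(self, rows):
        def squared_distance(row, centroid):
            return sum((a - b) ** 2 for a, b in zip(row, centroid))

        return [
            min(self.centroids_,
                key=lambda label: squared_distance(row, self.centroids_[label]))
            for row in rows
        ]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical data: [engine size in litres, horsepower] and a risk label.
X = [[1.0, 60], [1.2, 70], [1.1, 65], [1.3, 68],
     [2.0, 120], [2.2, 130], [2.1, 125], [1.9, 118],
     [3.5, 250], [3.8, 270], [3.6, 260], [4.0, 280]]
y = ["low"] * 4 + ["medium"] * 4 + ["high"] * 4

X_train, X_test, y_train, y_test = train_test_split(X, y)
model = NearestCentroid().fit(X_train, y_train)
score = accuracy(y_test, model.predict(X_test))
print(f"held-out accuracy: {score:.2f}")
```

The generated Notebook follows the same ingest, split, fit, and score rhythm, only with your real dataset and the pipeline AutoAI actually built.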

Experiment Notebooks

But we didn’t want to stop there. Once the data for an experiment has been preprocessed, we also give users insight into the entire AutoAI process with a Notebook that guides them through the full experiment!

Animation of a user saving the ‘experiment’ Notebook.
Saving the experiment Notebook

This notebook has several important sections. The first section exposes Python code and APIs for reviewing the experiment that was just launched. In this section, users can load details about the experiment and view summary tables and graphs of the generated pipelines.

Animation of a user scrolling through the inspection of the completed experiment in the experiment Notebook, including a summary of metrics from the tested pipelines
Inspecting the completed experiment using the experiment Notebook
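
At its core, a summary table like this ranks each pipeline by its evaluation metric. The sketch below is a toy illustration of that idea, not the Notebook's own API: the function names are hypothetical, and the pipeline names and accuracy figures are invented placeholders rather than results from the experiment shown here.

```python
# Hypothetical results: names and scores are invented for illustration only.
results = [
    {"pipeline": "Pipeline_2", "accuracy": 0.74},
    {"pipeline": "Pipeline_4", "accuracy": 0.82},
    {"pipeline": "Pipeline_1", "accuracy": 0.71},
    {"pipeline": "Pipeline_3", "accuracy": 0.79},
]


def leaderboard(rows, metric="accuracy"):
    """Return the rows sorted from best to worst on the chosen metric."""
    return sorted(rows, key=lambda row: row[metric], reverse=True)


def format_table(rows, metric="accuracy"):
    """Render ranked rows as a fixed-width text table."""
    lines = [f"{'rank':<6}{'pipeline':<14}{metric}"]
    for rank, row in enumerate(rows, start=1):
        lines.append(f"{rank:<6}{row['pipeline']:<14}{row[metric]:.2f}")
    return "\n".join(lines)


ranked = leaderboard(results)
print(format_table(ranked))
```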

Next is a section that lets users inspect any of the pipelines from the experiment. A pipeline visualization built on the Lale package gives users a much more tangible feeling for what makes up each pipeline generated by AutoAI. Users can also click on any node in the pipeline visualization to read documentation about that node’s transformer:

Animation of a user viewing the Lale depiction of a pipeline in the experiment Notebook and selecting a specific node to view its documentation.
Visualizing a pipeline with the Lale package in the experiment Notebook and opening documentation about a specific transformer

To take full advantage of Watson Studio, we also provide instructions and code for integrating this experiment into other data science lifecycle offerings. For example, in the image below, we display how to deploy a pipeline as a REST API that returns predictions online:

Screenshot of a user deploying a pipeline as a REST API scoring service in the experiment Notebook.
Deploying a pipeline as a REST API using the experiment notebook
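
Once a pipeline is deployed, a client scores new rows by posting JSON to the service. The sketch below only builds the request body. The `input_data` / `fields` / `values` layout is an assumption about the payload shape, the column names are hypothetical, and the endpoint URL and authentication are deliberately left out, so check your deployment's documentation before relying on it.

```python
import json


def build_scoring_payload(records):
    """Turn a list of row dicts into a fields/values scoring payload.

    Assumed layout: {"input_data": [{"fields": [...], "values": [[...], ...]}]}
    """
    if not records:
        raise ValueError("at least one record is required")
    fields = list(records[0])  # column order is taken from the first record
    values = []
    for record in records:
        if set(record) != set(fields):
            raise ValueError("all records must have the same columns")
        values.append([record[name] for name in fields])
    return {"input_data": [{"fields": fields, "values": values}]}


# Hypothetical automobile rows; these column names are made up for the example.
records = [
    {"engine_size": 1.2, "horsepower": 70, "body_style": "hatchback"},
    {"engine_size": 3.8, "horsepower": 270, "body_style": "convertible"},
]

payload = build_scoring_payload(records)
body = json.dumps(payload, indent=2)
print(body)
```

Keeping the column names in one `fields` list and the rows in `values` means many rows can be scored in a single request without repeating the names.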

Finally, with the last section in the notebook, users can run the entire AutoAI experiment, from data ingestion to optimal pipeline selection, using a single API call. This lets users integrate AutoAI directly into their other workflows:

Animation of a user scrolling through the SDK section of the experiment Notebook.
Animation of a user running the AutoAI experiment using an API call in the experiment Notebook.
Using the API exposed in the experiment notebook to run the entire AutoAI experiment

Both of the Notebooks described in this post render the power of AutoAI much more accessible to users who want further customization and the ability to share their work and integrate source code into their other projects.

To experience AutoAI and Notebooks, visit Watson Studio.

Happy modeling!
