How Second Generation AutoML Will Replace Software Development from the Inside Out
I just returned from a few weeks in Europe speaking at various AI and ML conferences. Notwithstanding our many posts on the topic, I actually don’t like to give talks about why Auger.AI is the best AutoML product. Instead what I have been talking about is where I see all of this innovation in AutoML heading.
“First generation” Automated Machine Learning was focused on the “citizen data scientist”. Business analysts have something they need to predict or classify. They upload their spreadsheet to a service. The service provides a “leaderboard” of winning algorithms and a bunch of supplementary widgets to understand the trained models in more detail. From that, the business analyst chooses which model to deploy for predictions. This first generation of AutoML provided real value both in getting non-technical users to see the value of machine learning in general and in delivering better accuracy once the decision to use machine learning for predictions had been made.
But the potential of machine learning goes far beyond the business analyst. Other AutoML providers are starting to see these possibilities. Without going out of their way to point out their developer-focused nature and what it enables, both Google’s and Microsoft’s AutoML offerings are indeed very programmer-oriented. There’s not a leaderboard in sight.
We at Auger.AI believe that such full automation of machine learning represents a “second generation of AutoML” and enables a new class of machine learning projects. Instead of working on big multi-week (or multi-month) machine learning projects with teams of data scientists, small micro-decisions in every important enterprise application can be automated and determined with AutoML. This is a much bigger potential change than automating only what humans deem to be the most pressing predictive problem.
AI is indeed eating software. As Marc Andreessen pointed out in 2011, Software Is Eating the World. That transition has already largely happened. Every piece of such software will become intelligent. Instead of programmers writing hundreds of lines of code with if-then-elseif-else statements and switch-case statements, those judgments can be made with small predictive models. Instead of users of software wading through nested menus to find their actions (e.g. choosing the account to contact next in their CRM, determining the best indicator in a medical diagnosis), the software will present the most likely choices for user judgment. In other contexts, given sufficient accuracy and depending on the problem domain, decisions can be made in completely automated fashion.
Auger.AI, Google AutoML Tables, and Microsoft Azure AutoML’s API are very focused on this “Automated AutoML” scenario. They actually have more in common than they differ. They all follow what we refer to as the PREDIT pipeline:
- Import data into the AutoML infrastructure
- Train an “AutoML experiment”: try dozens of algorithms and hundreds of hyperparameter settings to find the best model
- Evaluate the best model
- Deploy the model
- Predict new targets based on newly encountered data
- Review real world performance of the model
It’s hard to remember “ITEDPR”, so we concocted a corny yet memorable anagram for it: the PREDIT (from the French “prédit”, meaning “predicts”) Pipeline. All the steps in the full PREDIT pipeline (automatically bringing in data, training, deploying, predicting, and reviewing model performance, followed by reimporting data and retraining) are essential in any application that wants to take maximum advantage of AutoML without a constant “citizen data scientist” in the loop.
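The six stages listed above can be sketched as one automated cycle. The Python below is only an illustration under stated assumptions: `AutoMLProvider`, `MeanBaselineProvider`, `run_predit_cycle`, and the `max_error` threshold are hypothetical names invented for this sketch, not the API of A2ML, Auger.AI, Google AutoML Tables, or Azure AutoML. The "training" step is a trivial mean baseline so the example stays self-contained.

```python
from abc import ABC, abstractmethod
from statistics import mean


class AutoMLProvider(ABC):
    """Hypothetical interface with one method per stage: I, T, E, D, P, R."""

    @abstractmethod
    def import_data(self, rows):
        """Import (features, target) rows into the AutoML infrastructure."""

    @abstractmethod
    def train(self):
        """Run an experiment and keep the best candidate model."""

    @abstractmethod
    def evaluate(self):
        """Return an error score for the best candidate model."""

    @abstractmethod
    def deploy(self):
        """Promote the candidate model so it can serve predictions."""

    @abstractmethod
    def predict(self, features):
        """Predict a target for newly encountered data."""

    @abstractmethod
    def review(self, pairs):
        """Return real-world error from (predicted, actual) pairs."""


class MeanBaselineProvider(AutoMLProvider):
    """Toy stand-in for a real service: always predicts the training mean."""

    def __init__(self):
        self._rows = []
        self._candidate = None
        self._deployed = None

    def import_data(self, rows):
        self._rows.extend(rows)

    def train(self):
        self._candidate = mean(target for _, target in self._rows)

    def evaluate(self):
        return mean(abs(target - self._candidate) for _, target in self._rows)

    def deploy(self):
        self._deployed = self._candidate

    def predict(self, features):
        if self._deployed is None:
            raise RuntimeError("deploy() must be called before predict()")
        return self._deployed

    def review(self, pairs):
        return mean(abs(predicted - actual) for predicted, actual in pairs)


def run_predit_cycle(provider, training_rows, new_features, actuals, max_error):
    """Run one Import-Train-Evaluate-Deploy-Predict-Review pass."""
    provider.import_data(training_rows)
    provider.train()
    training_error = provider.evaluate()
    provider.deploy()
    predictions = [provider.predict(features) for features in new_features]
    live_error = provider.review(list(zip(predictions, actuals)))
    # If live performance has degraded, the caller should reimport and retrain.
    retrain_needed = live_error > max_error
    return training_error, predictions, live_error, retrain_needed
```

An application would call `run_predit_cycle` on a schedule and feed newly observed actuals back in as training rows whenever `retrain_needed` is true, which is what keeps a human analyst out of the loop.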
We have introduced a common open source API layer that reflects this common “Automated AutoML” PREDIT pipeline. We call it A2ML for “Automated AutoML”. A2ML supports Auger.AI and Google AutoML Tables today [Note: we are working on full support for Azure AutoML]. The native AutoML APIs from Google and Azure are very detailed in their capabilities but are at least 20 times more verbose in usage because they are “add-ons” to the core cloud capabilities of Google Cloud and Microsoft Azure. We also believe that “replacing software logic with ML” is a fundamental enough shift that it needs a company and dedicated product focused on it (not a cloud service add-on), but that is a topic for another post.
In summary, we believe that common APIs for AutoML, whether established as a formal standard or as a de facto one through widespread adoption, will accelerate this process of “AI Eating Software”: providing better applications for businesses and users and drastically increasing developer productivity. Subsequent posts here will discuss the A2ML API in much more detail.