Democratizing AI to Accelerate ML Model Development in Weeks vs. Months

Published in Intuit Engineering · 6 min read · May 4, 2023

As a global financial technology platform company, accelerating machine learning (ML) model development is key to Intuit’s ability to deliver personalized AI experiences at scale to more than 100 million consumer and small business customers with TurboTax, Credit Karma, QuickBooks and Mailchimp.

Over the past couple of years we’ve set out to “democratize AI” by enabling product developers to build, deploy and monitor highly performant models — and to do so faster with a no-code, self-serve solution. Now, Intuit technologists are building AI capabilities into products as easily as any other feature or component, whether or not they have specialized knowledge of AI.

Our results tell us we’re on the right track: on average, we’ve reduced model development time from 3–4 months to 2 weeks or less. For example:

TurboTax Online: an Intuit data analyst with limited ML experience reduced model building time from 1 week to 5 hours.
QuickBooks Online: an Intuit data scientist reduced model development and deployment time from 2 months to 4 days.
QuickBooks Live: an Intuit software engineer reduced model development and deployment time from 4–6 weeks to 4 days.

To accomplish this we’ve built an internal AI Marketplace where software developers, data scientists, machine learning engineers, and data analysts alike can tap into AI-ready components that can be applied to power multiple use cases. Within the marketplace, technologists can also consume reusable AI-native models and reusable AI services to build their own models within Intuit’s responsible AI and data governance guidelines and guardrails.

Operationalizing AI with MLOps and AutoML

Like many companies, Intuit has been on a journey to realize the full benefits of AI by applying tools and methodologies to help manage data, models, deployment and monitoring. Among the core elements of our AI Marketplace is the Model Studio, where we automate various phases of MLOps, leveraging AutoML and Kubeflow pipelines to accelerate model development.

Before delving into each step within our model flow, let us examine MLOps and AutoML.

MLOps — MLOps is a set of practices that aims to operationalize model development, deployment and maintenance in production, reliably and efficiently, similar to DevOps for software development. The cycle of explore -> experiment -> featurize -> train -> validate -> test -> review -> deploy -> infer -> monitor -> retrain is automated to ensure quality, reproducibility and reliability. At Intuit, we also adhere to a set of responsible AI and data governance practices within our MLOps flow.

AutoML — AutoML aims to provide non-experts with automated tools that can be easily used for model development and deployment. Technologists can run experiments on a data set to train a model, optimize it with a performance metric [e.g., AUC (area under the receiver operating characteristic curve)] and apply constraints (cost or duration of experiment). The output of experimentation is a leaderboard with various models ranked on the performance metric. If the top-ranked model meets the criteria for model performance, they can proceed to next steps; otherwise they return to the input dataset to make changes, or adjust the configuration to begin a new experiment.
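To make the leaderboard idea concrete, here is a minimal sketch in plain Python. It is illustrative only and is not how Intuit's AutoML engines are implemented: the function names (`auc`, `leaderboard`, `meets_criteria`), the candidate model names and the minimum AUC threshold are all assumptions. It computes AUC with the standard rank-sum identity, ranks candidate models on that metric, and checks whether the top-ranked model clears a quality bar.

```python
from typing import Dict, List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed with the rank-sum (Mann-Whitney U) identity."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Tied scores share the average of the ranks they span.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def leaderboard(
    labels: Sequence[int], candidate_scores: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Rank candidate models by AUC, best first."""
    rows = [(name, auc(labels, scores)) for name, scores in candidate_scores.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


def meets_criteria(board: List[Tuple[str, float]], min_auc: float) -> bool:
    """True when the top-ranked model reaches the (assumed) minimum AUC."""
    return bool(board) and board[0][1] >= min_auc


if __name__ == "__main__":
    # Hypothetical validation labels and scores from two candidate models.
    y_true = [0, 0, 1, 1, 0, 1]
    board = leaderboard(
        y_true,
        {
            "model_a": [0.1, 0.4, 0.35, 0.8, 0.2, 0.9],
            "model_b": [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
        },
    )
    for name, score in board:
        print(f"{name}: AUC={score:.3f}")
    print("proceed" if meets_criteria(board, min_auc=0.8) else "start a new experiment")
```

If `meets_criteria` returns `False`, the user would revisit the input dataset or adjust the experiment configuration, as described above.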

How does Intuit’s AI Marketplace enable AI democratization?

Within Intuit’s AI Marketplace Model Studio, a technologist can onboard a new use case by providing the dataset location and the type of problem they’d like to solve. After that, they can trigger workflows to run experiments, iterating as many times as needed to create a quality model that’s ready for deployment.

Data Validation workflow — The first workflow inspects the dataset, checking for access permissions and data quality. At this stage, the user receives feedback, enabling them to go back and fix any issues in the dataset, if needed, before proceeding to the next step for model training. Warnings provided are for informational purposes and the training engine can overcome those without consequence. However, error messages require action, blocking the user from proceeding.

**Data Validation Warning & Error Message Examples**
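The distinction between warnings and errors can be sketched as follows. This is a hypothetical example, not Model Studio's actual validation logic: the specific checks (empty dataset, missing label column, share of missing values), the `max_missing_ratio` default and all names are assumptions chosen for illustration. The point it shows is that warnings are informational while any error blocks the move to model training.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Severity(Enum):
    WARNING = "warning"  # informational; training can still proceed
    ERROR = "error"      # blocking; the user must fix the dataset first


@dataclass(frozen=True)
class Finding:
    severity: Severity
    message: str


def validate_dataset(
    rows: List[Dict[str, Optional[float]]],
    label_column: str,
    max_missing_ratio: float = 0.2,  # assumed threshold, for illustration only
) -> List[Finding]:
    """Run a few illustrative data quality checks and collect the findings."""
    if not rows:
        return [Finding(Severity.ERROR, "dataset is empty")]

    findings: List[Finding] = []
    if any(label_column not in row for row in rows):
        findings.append(
            Finding(Severity.ERROR, f"label column '{label_column}' is missing from at least one row")
        )

    # Check every feature column that appears anywhere in the dataset.
    columns = sorted({column for row in rows for column in row})
    for column in columns:
        if column == label_column:
            continue
        missing = sum(1 for row in rows if row.get(column) is None)
        ratio = missing / len(rows)
        if ratio > max_missing_ratio:
            findings.append(
                Finding(Severity.WARNING, f"column '{column}' has {ratio:.0%} missing values")
            )
    return findings


def can_proceed(findings: List[Finding]) -> bool:
    """Only errors block the workflow; warnings do not."""
    return all(finding.severity is not Severity.ERROR for finding in findings)


if __name__ == "__main__":
    sample = [
        {"alcohol": 9.4, "ph": None, "quality": 1},
        {"alcohol": 10.1, "ph": 3.2, "quality": 0},
    ]
    results = validate_dataset(sample, label_column="quality")
    for finding in results:
        print(f"[{finding.severity.value}] {finding.message}")
    print("can proceed:", can_proceed(results))
```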

Model Training and evaluation workflow — This workflow provisions the AutoML engine from the resource pool, according to the type of problem and size of the dataset. It then runs experiments on user data before generating the leaderboard with top-ranked model candidates.
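One way to picture provisioning "according to the type of problem and size of the dataset" is a simple lookup over a pool of engines. The sketch below is purely hypothetical: the engine names, the supported problem types, the row limits and the "smallest engine that fits" policy are assumptions, and the real resource pool may work quite differently.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Engine:
    name: str
    problem_types: FrozenSet[str]  # problem types this engine can train
    max_rows: int                  # largest dataset it is sized for


def select_engine(pool: List[Engine], problem_type: str, n_rows: int) -> Engine:
    """Pick the smallest engine in the pool that supports the problem and fits the data."""
    eligible = [
        engine
        for engine in pool
        if problem_type in engine.problem_types and n_rows <= engine.max_rows
    ]
    if not eligible:
        raise LookupError(f"no engine available for {problem_type!r} with {n_rows} rows")
    return min(eligible, key=lambda engine: engine.max_rows)


if __name__ == "__main__":
    # Hypothetical pool; names and limits are made up for this example.
    pool = [
        Engine("small-tabular", frozenset({"binary_classification", "regression"}), 100_000),
        Engine("large-tabular", frozenset({"binary_classification", "regression"}), 10_000_000),
    ]
    print(select_engine(pool, "binary_classification", 5_000).name)
    print(select_engine(pool, "regression", 1_000_000).name)
```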

Users get detailed reports on every aspect of the experiment (e.g., hyperparameters, final features contributing to the model), along with a quick view of highlights in the UI, such as a summary of a binary classification model for wine quality.

At this stage, a user can select the best model for their use case and proceed to the next step, after the model has met quality criteria set in the guardrails. If for any reason the model doesn’t meet criteria, they can return to the previous steps to re-work the data set or experiment.

Model approval and deployment workflow — After the user has selected the optimal model for their use case, they must enlist stakeholders in a workflow for verification and approval of the model, which must meet business requirements and adhere to responsible AI practices for safe deployment to production. As needed, data scientists are available to provide advice and support (e.g., better quality datasets, additional experimentation). Upon completion, the model is deployed for serving on Intuit’s Machine Learning Platform and instrumented for monitoring, with drift detection and alerting.
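The post does not say how drift is measured, so the following is only one common option, sketched under stated assumptions: it uses the Population Stability Index (PSI) to compare a feature's training-time distribution with what the model sees in production. The bin count, the smoothing constant `eps` and the alert threshold are illustrative choices, not values from Intuit's platform.

```python
import math
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float],
    actual: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between a baseline sample and a production sample of one feature."""
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("baseline sample has no spread; cannot build bins")
    width = (hi - lo) / bins

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for value in values:
            index = int((value - lo) / width)
            # Clamp values outside the baseline range into the edge bins.
            index = min(max(index, 0), bins - 1)
            counts[index] += 1
        # Floor at eps so empty bins do not produce log(0) or division by zero.
        return [max(count / len(values), eps) for count in counts]

    exp_p = proportions(expected)
    act_p = proportions(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(exp_p, act_p))


def drift_alert(psi: float, threshold: float = 0.25) -> bool:
    """Flag drift when PSI exceeds an assumed threshold."""
    return psi > threshold


if __name__ == "__main__":
    baseline = [i / 100 for i in range(100)]           # training-time values
    production = [0.5 + i / 200 for i in range(100)]   # shifted toward higher values
    psi = population_stability_index(baseline, production)
    print(f"PSI={psi:.3f}, alert={drift_alert(psi)}")
```

In practice a check like this would run on a schedule per feature (and on model scores), with alerts routed to the model owner.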

The development platform behind our solution

Our AI Marketplace Model Studio leverages Intuit’s paved roads for building end-to-end experiences for technologists:

Workflows for data validation, model training and model deployment have multiple phases. Within each phase, the state of the experiment is maintained, giving the user the ability to: 1) retry and resume in case of failure or 2) take action in case of success.
Reusable workflows are powered by pipeline templates deployed in Kubeflow, which report status updates to Model Studio APIs. These APIs maintain and store updates, enabling users to track progress within their workflow at any time, via the API or UI. Conventions and standards are built into the workflows, requiring minimal user input.
The component-based architecture of Kubeflow pipelines makes the solution extensible. For example, new AutoML training engines or fine tuning solutions can be added to the AutoML resource pool; new data validation components can be added for different data types to the validation workflow (e.g., text, images); and new model quality checks can be added as components, depending on the type of models.
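The "retry and resume" behavior described above can be illustrated with a small sketch. It does not use Kubeflow or the Model Studio APIs; `WorkflowRun`, its status strings and the phase names are hypothetical. It only shows the idea of keeping per-phase state so that a rerun skips phases that already succeeded and picks up at the one that failed.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Phase = Tuple[str, Callable[[], None]]  # (phase name, function that runs the phase)


@dataclass
class WorkflowRun:
    phases: List[Phase]
    status: Dict[str, str] = field(default_factory=dict)  # per-phase state

    def run(self) -> bool:
        """Run phases in order; return True when every phase has succeeded."""
        for name, step in self.phases:
            if self.status.get(name) == "succeeded":
                continue  # resume: do not repeat completed work
            self.status[name] = "running"
            try:
                step()
            except Exception as exc:
                # Record the failure and stop so the user can retry later.
                self.status[name] = f"failed: {exc}"
                return False
            self.status[name] = "succeeded"
        return True


if __name__ == "__main__":
    attempts = {"train": 0}

    def validate() -> None:
        print("validating dataset")

    def train() -> None:
        attempts["train"] += 1
        if attempts["train"] == 1:
            raise RuntimeError("engine unavailable")  # simulated transient failure
        print("training model")

    workflow = WorkflowRun([("validate", validate), ("train", train)])
    print("first run ok:", workflow.run(), workflow.status)
    print("retry ok:", workflow.run(), workflow.status)  # validate is skipped
```

Because the status map is the only state, it could just as easily be stored behind an API and read by a UI to show progress.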

Building for scale!

We’ve built Model Studio to scale within our AI Marketplace, applying a set of durable principles to streamline model development for product teams across our global organization:

End-to-end workflow, leveraging automation
Self-serve, no-code solution with an intuitive UI
Built-in guardrails to keep the bar high on data quality, model quality and responsible AI practices

Our Model Studio in AI Marketplace has revolutionized the way Intuit product teams operate. MLOps automation speeds up model training with AutoML, and Kubeflow pipelines simplify deployment for inference, retraining and monitoring. By putting foundational building blocks in the hands of thousands of Intuit technologists to create AI-powered products with greater efficiency, we’re getting to market faster with personalized experiences for our consumer and small business customers.

Democratizing AI in a GenAI world

We’re proud of our accomplishments to date and excited about what the future holds as we continue our journey. In the coming weeks, we’ll enhance our AI Marketplace with a new generative AI lab environment. This will enable our global technology organization to more easily experiment with generative AI models from multiple third parties and Intuit’s own foundational language models.