Five Tips For Your Next AI Project

Published in back to the napkin · 5 min read · Jun 16, 2021

Machine learning (ML) and artificial intelligence (AI) continue to claim a large share of corporate investment because of their business potential. Revenue generated by AI hardware, software, and services was estimated to be $156.5B worldwide in 2020, according to market researcher IDC, up 12.3% from 2019. However, a majority of AI initiatives still end up stalling. Pactera claims that a staggering 85% of AI projects fail to deliver. At Dialexa, we have seen firsthand how fragile these important projects can be, and we have put together a few tips to improve success rates.

Understand the data and objectives
Use both business and model metrics
Develop a baseline model quickly
Establish the development processes early
Evaluate project health and risk often

1. Understand the data and objectives

Exploratory data analysis and problem discovery are essential for identifying a project's risks and opportunities. Machine learning projects can be expensive, so it may be tempting to cut corners. However, these exploratory tasks should not be skipped, since their benefits typically outweigh their cost. Things to pay attention to:

Data quality should be reviewed to determine whether machine learning is the right solution. Is there enough volume? How often are values missing? Is there metadata or resources explaining what the data means? Are you missing any data needed to address the objective?
Data leakage, where the training data contains information that will not be available at prediction time, is an expensive problem that often goes unnoticed until the model is used in production. Analyzing the data enough to develop leakage prevention strategies can often save your project.
Problem formulation helps inform the requirements of the model. What problem is being solved? How should the model work? How will the user interact with it? What are the desired outputs? How do we know the model is working well? Does the problem fit into typical machine learning objectives within supervised learning, unsupervised learning, or reinforcement learning?
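As a rough illustration of the data quality and leakage points above, here is a minimal sketch using only the Python standard library. The records, the field names (`signup`, `plan`, `monthly_spend`), and the cutoff date are all hypothetical. A chronological split is only one possible leakage prevention strategy, and it assumes the model will be used to predict on records newer than those it was trained on.

```python
from datetime import date


def missing_rates(rows, columns):
    """Return the fraction of rows where each column is None or empty."""
    total = len(rows)
    rates = {}
    for col in columns:
        missing = sum(1 for row in rows if row.get(col) in (None, ""))
        rates[col] = missing / total if total else 0.0
    return rates


def chronological_split(rows, date_key, cutoff):
    """Train on records before the cutoff and evaluate on the rest,
    so that no future information reaches the training set."""
    train = [r for r in rows if r[date_key] < cutoff]
    test = [r for r in rows if r[date_key] >= cutoff]
    return train, test


# Hypothetical records, made up for illustration only.
records = [
    {"signup": date(2021, 1, 5), "plan": "basic", "monthly_spend": 20.0},
    {"signup": date(2021, 2, 9), "plan": None, "monthly_spend": 35.0},
    {"signup": date(2021, 3, 14), "plan": "pro", "monthly_spend": None},
    {"signup": date(2021, 4, 2), "plan": "pro", "monthly_spend": 50.0},
]

rates = missing_rates(records, ["plan", "monthly_spend"])
train_rows, test_rows = chronological_split(records, "signup", date(2021, 3, 1))

print("missing rates:", rates)
print("train size:", len(train_rows), "test size:", len(test_rows))
```

A high missing rate on a field the objective depends on is an early signal that more data collection may be needed before modeling.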

2. Use both business and model metrics

Business metrics help quantify the impact of your model on your organization. When designed well, they can quickly justify the need for the project, based on benefits and progress. However, business metrics are often dollar or time estimates that rest on multiple assumptions, so they lack the precision needed to evaluate models.

Model metrics measure the performance and robustness of your model on a dataset. They serve as a feedback mechanism during the model training phases, while also identifying the highest quality model across experiments. Model metrics are often relative to the data and fail to provide the simplicity and objectivity required to communicate business impact.

Business and model metrics complement each other to provide a comprehensive view of the project. Using both types of metrics enables you to evaluate the models and justify the investment. For example, on a customer churn prediction project you may use log loss or F1 score as a metric to train and compare models. Then, with some assumptions, you can use the estimated impact on customer lifetime value as a business metric. Both kinds of metrics are derived from model performance, so they can be tracked across experiments and updated as new data arrives and assumptions change throughout the project.
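To make the churn example concrete, here is a small sketch that computes log loss and F1 score alongside a simple business estimate. The labels, probabilities, and the business assumptions (`lifetime_value`, `save_rate`, `offer_cost`) are invented for illustration, and the `retained_value` formula is just one plausible way to turn predictions into dollars, not a standard definition.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Average binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (1 = churn)."""
    tp = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 1)
    fp = sum(1 for y, yh in zip(y_true, y_pred) if y == 0 and yh == 1)
    fn = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def retained_value(y_true, y_pred, lifetime_value, save_rate, offer_cost):
    """Hypothetical business metric: value of churners saved by a retention
    offer, minus the cost of sending the offer to everyone flagged."""
    true_positives = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 1)
    targeted = sum(y_pred)
    return true_positives * save_rate * lifetime_value - targeted * offer_cost


# Made-up evaluation data: 1 = churned, 0 = stayed.
y_true = [1, 0, 1, 0, 1, 0, 0, 1]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.3, 0.1, 0.7, 0.8]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

model_log_loss = log_loss(y_true, y_prob)
model_f1 = f1_score(y_true, y_pred)
# Assumed business inputs: $1,000 lifetime value, 30% of offers succeed, $50 per offer.
value = retained_value(y_true, y_pred, lifetime_value=1000.0, save_rate=0.3, offer_cost=50.0)

print(f"log loss: {model_log_loss:.3f}")
print(f"F1 score: {model_f1:.3f}")
print(f"estimated net retained value: ${value:.2f}")
```

Rerunning the same script with updated assumptions or new evaluation data keeps the model and business views in sync across experiments.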

3. Develop a baseline model quickly

Competition sparks innovation. It’s also useful when training models. It is essential to establish a baseline as soon as possible. A baseline serves as a starting point from which to improve and as a central axis around which you can design the remaining product components. It also helps deter teams from diving straight into an overly complicated solution. If you are lucky, it may even reveal that machine learning isn’t necessary at all and that a simple rules-based engine can solve the problem.

A good baseline model is as simple as possible. It doesn’t take a long time to implement and is easy to understand. Examples include:

Returning the majority class in classification problems
Using an average value or linear regression for continuous outputs
Using pretrained models for complex deep learning tasks such as computer vision
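The first two baselines in the list above take only a few lines to build. The sketch below uses the standard library and made-up labels and targets; the function names are illustrative and not taken from any particular library.

```python
from collections import Counter
from statistics import mean


def majority_class_baseline(train_labels):
    """Return a predictor that always outputs the most frequent training label."""
    most_common_label, _count = Counter(train_labels).most_common(1)[0]
    return lambda _features=None: most_common_label


def mean_baseline(train_targets):
    """Return a predictor that always outputs the average training target."""
    average = mean(train_targets)
    return lambda _features=None: average


def accuracy(predict, labels):
    """Share of labels the constant predictor gets right."""
    return sum(1 for y in labels if predict() == y) / len(labels)


# Made-up data for illustration.
train_labels = ["stay", "stay", "churn", "stay"]
test_labels = ["stay", "churn", "stay", "stay"]
train_targets = [10.0, 20.0, 30.0]

classify = majority_class_baseline(train_labels)
regress = mean_baseline(train_targets)

baseline_accuracy = accuracy(classify, test_labels)
print("majority class:", classify())
print("baseline accuracy:", baseline_accuracy)
print("mean prediction:", regress())
```

Any later model should beat these numbers on the same held-out data; if it doesn't, the added complexity is not yet paying for itself.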

4. Establish the development processes early

Machine learning is still relatively new in software products. As such, many best practices used in software are still foreign to machine learning workflows. However, taking time to set up these processes early in your project will reap benefits in the long term.

Testing code and workflows in machine learning is hard, but catching errors later is harder. Many models are difficult to explain, which makes them harder to monitor over time. Check out Jeremy Jordan’s explanation of why we need tests in machine learning systems and his guidelines for test implementation.
Logging during code execution saves time and money for projects. During training, log events, parameters, and experiment results to avoid duplicating training jobs and losing some of the rich and expensive information produced along the way. When serving the model, proper error handling and logging around requests help you catch and debug problems quickly. The great thing is that it doesn’t take a lot to set up logging for your project, and there are plenty of helpful resources available.
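As a minimal sketch of the logging advice above, the following uses Python's built-in `logging` module. The `train` loop is a placeholder that just halves a fake loss value, and the `predict` wrapper and its payload format are hypothetical. A real project would log its own parameters and metrics, possibly through a dedicated experiment tracker.

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)
logger = logging.getLogger("training")


def train(params, epochs):
    """Placeholder training loop that logs parameters and per-epoch results."""
    logger.info("starting run params=%s epochs=%d", params, epochs)
    history = []
    loss = 1.0
    for epoch in range(1, epochs + 1):
        loss *= 0.5  # stand-in for a real training step
        history.append(loss)
        logger.info("epoch=%d loss=%.4f", epoch, loss)
    logger.info("finished run final_loss=%.4f", loss)
    return history


def predict(model, payload):
    """Call the model and log the full traceback if the request fails."""
    try:
        return model(payload)
    except Exception:
        logger.exception("prediction failed payload=%r", payload)
        return None


history = train({"learning_rate": 0.1}, epochs=3)

# Hypothetical model that expects an "amount" field in the payload.
def toy_model(payload):
    return payload["amount"] * 2

ok_result = predict(toy_model, {"amount": 21})
failed_result = predict(toy_model, {})  # missing field: logged, returns None
```

Because each run records its parameters and results, a finished experiment can be looked up in the logs instead of being retrained.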

5. Evaluate project health and risk often

Don’t be afraid to fail but also don’t trap yourself in failure.

Many problems can’t be solved with machine learning. There are likely many other AI opportunities in your organization that could benefit from exploration, so don’t feel like your first endeavor needs to be a complete success. When starting a machine learning program, it is often best to focus on easy wins and value first before trying to tackle a complicated problem.

AI projects typically expose other opportunities or requirements as well. You may find data governance or data collection opportunities that strengthen the organization’s data over the long term. You may also find that the business case is not yet strong enough to justify a machine learning implementation. Identifying these obstacles early can limit sunk costs.

AI projects are challenging, rewarding, and incredibly important for today’s organizations. Following these tips won’t guarantee success, but it will help you avoid many of the major pitfalls of AI projects. I hope they help you in your initiatives, and I look forward to seeing the next wave of data-driven products.