12 Steps to Applied AI

A roadmap for every machine learning project

Cassie Kozyrkov
Published in The Startup
9 min read · Dec 9, 2019


For those who’ve been looking for a 12-step program to get rid of bad data habits, here’s a handy applied machine learning and artificial intelligence project roadmap. Well, it should properly be 13 steps, so we’ll start counting at zero to make it work.

(All links take you to articles by the same author.)

In practice you’ll need to iterate and backtrack plenty, BUT you should never start a step without at least attempting each of the previous ones.
For those who like video summaries, here is a video summary from my course.

Step 0: Reality check and setup

Check that you actually need ML/AI. Can you identify many small decisions you need help with? Has the non-ML/AI approach already been shown to be worthless? Do you have data to learn from? Do you have access to hardware? If not, don’t pass GO.

Pro tip: besides looking like a bunch of rampaging amateurs, leaders who try to shove AI approaches where they don’t belong usually end up with solutions which are too costly to maintain in production. Instead, find a good problem to solve and may the best solution win. If you can do it without AI, so much the better. ML/AI is for those situations where the other approaches don’t get you the performance you need. It’s useful and it’s here to stay, but it’s not for everything.


Step 1: Define your objectives

Clearly express what success means for your project. Your ML/AI system is going to produce a bunch of labels for you: how will you score its performance on the task you set it? How promising does it need to be in order to be worth productionizing? What’s the minimum acceptable performance for it to be worth launching?
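One way to make those answers concrete is to write the scoring rule and the launch bar down as code before any modeling starts. The sketch below is only an illustration: plain accuracy as the metric, the 0.90 bar, and the function names are all hypothetical placeholders, not recommendations.

```python
from typing import Sequence

# Hypothetical launch bar, agreed by the decision-maker before any modeling.
MIN_ACCEPTABLE_ACCURACY = 0.90


def score(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of labels the system got right (plain accuracy)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def worth_launching(predicted: Sequence[str], actual: Sequence[str]) -> bool:
    """True only if performance clears the pre-agreed minimum."""
    return score(predicted, actual) >= MIN_ACCEPTABLE_ACCURACY
```

The point of the exercise is the order of operations: the bar is fixed first, and the model is judged against it later.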

Pro tip: make sure this part is done by whoever knows the business best and has the sharpest decision-making skills, not the best equation nerdery. Skipping this step or doing it out of sequence is the leading cause of data science project failure. Don’t. Even. Think. About. Skipping. It.


Step 2: Get access to data

Create the process and code for collecting instance IDs and some features that go with those IDs. You’ll also need the correct labels if you’re doing supervised or semi-supervised learning — in practice, these are often made by humans performing the task over and over.

Pro tip: consider a dress rehearsal with simulated data before purchasing data or going out into the real world to collect your own.
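A dress rehearsal can be as small as the sketch below, which fabricates rows with an instance ID, two features, and a label. Everything in it is invented for illustration: the feature names, the labeling rule, and the 10% label noise are assumptions, so swap in whatever resembles your real problem.

```python
import random


def simulate_instances(n: int, seed: int = 0) -> list[dict]:
    """Generate fake rows: an instance ID, two features, and a label."""
    rng = random.Random(seed)  # seeded so the rehearsal is reproducible
    rows = []
    for i in range(n):
        hours = rng.uniform(0, 10)   # made-up numeric feature
        visits = rng.randint(0, 5)   # made-up count feature
        # Made-up rule, so the pipeline has a pattern to find.
        label = hours + visits > 7
        # Flip about 10% of labels to mimic messy real-world data.
        if rng.random() < 0.1:
            label = not label
        rows.append({"id": f"sim-{i:05d}", "hours": hours,
                     "visits": visits, "label": label})
    return rows
```

If the rest of your pipeline can’t recover a pattern you planted yourself, that’s worth knowing before you pay for real data.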


Step 3: Split your data

Set some of your data aside so that you have the opportunity to check how well your pattern-based recipes work outside the data you found them in. It’s crucial that you evaluate performance where it matters: on fresh, relevant data you haven’t used for anything else.

Split your data into 3 datasets: training, validation, and test. (You’ll later split your training dataset further into two pieces for model fitting and debugging, but don’t worry about that just yet.)

Pro tip: implement splitting at the infrastructure level if you can and have tight access control so your test data don’t get misused accidentally.
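Here is a minimal sketch of one way to do the three-way split: hash each instance ID so that the same instance always lands in the same dataset, no matter who runs the code or when. The 70/15/15 proportions and the function name are assumptions for illustration, and access control still has to come from your infrastructure.

```python
import hashlib


def assign_split(instance_id: str) -> str:
    """Deterministically map an instance ID to 'training', 'validation', or 'test'."""
    digest = hashlib.sha256(instance_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in 0..99
    if bucket < 70:
        return "training"      # assumed 70%
    if bucket < 85:
        return "validation"    # assumed 15%
    return "test"              # assumed 15%
```

Because the assignment depends only on the ID, new data can be routed as it arrives without reshuffling anything already set aside.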


  • Learn why we split our data.

Step 4: Explore your data

It’s time for analytics! Look at some (not all!) of your data. Use your training dataset to plot data, complete sanity checks, and engineer new features. Never forget that real world data are messy, so trust no one and trust nothing. Instead, think of your dataset as a textbook you’re using to teach your machine students. Only a daft teacher assigns a textbook they haven’t looked inside.

Pro tip: don’t forget to apply the code you write to clean your data and create new features to your validation and test datasets… without poking around in them.
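One way to follow that tip is to separate learning the cleaning parameters from applying them, as in the sketch below. The specific cleaning steps (fill missing values with the training mean, clip to the training range) are assumptions chosen for illustration, and the function names are hypothetical.

```python
from statistics import mean


def fit_cleaner(training_values):
    """Learn cleaning parameters from the training data only."""
    observed = [v for v in training_values if v is not None]
    return {"fill": mean(observed), "low": min(observed), "high": max(observed)}


def apply_cleaner(values, params):
    """Fill missing values and clip to the training range, without re-fitting."""
    cleaned = []
    for v in values:
        v = params["fill"] if v is None else v
        cleaned.append(min(max(v, params["low"]), params["high"]))
    return cleaned
```

You would call fit_cleaner once on the training dataset, then pass the same params to apply_cleaner for the validation and test datasets, so nothing is ever learned from them.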


  • Articles about the nature of analytics: [1], [2], [3], [4]
  • Exploration helps you fight AI bias.

Step 5: Prepare your tools

This is where you make friends with your ML/AI toolbox and get to know all the pattern-finding algorithms you’re going to try running. Don’t expect your data to be in a format those packages will accept — you’ll likely need to do a bunch of setup and code wrangling to get those algorithms to accept your data.

Pro tip: always try running existing packages before you let yourself even think about reinventing a wheel. This is the opposite of the instinct you’re taught in AI classes aimed at researchers (whose job involves inventing new wheels), so prepare to battle your own habits if you’re the academic type.


  • The important difference between AI research and applied AI.
  • How do ML/AI algorithms work?
  • Everyone seems to be talking about TensorFlow, but what is it?

Step 6: Use your tools to train some models

Find and exploit patterns in your data to make recipes. Split your training data and run some of it through the algorithms you prepared in Step 5 to fit some candidate models by finding patterns and turning those patterns into recipes. Evaluate performance on the rest of your training data. Tinker as much as you like, iterating in the direction of the more promising algorithms and backtracking to prep their cousins to receive your data.

Pro tip: kick it up a notch with cross-validation instead of a single holdout set.
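For the curious, here is a bare-bones sketch of k-fold cross-validation within the training dataset: every row is held out exactly once, and you get one score per fold. In practice an existing package would do this for you; the function names and the toy majority-label “model” are assumptions purely for illustration.

```python
import random
from statistics import mean


def k_fold_scores(rows, k, fit, evaluate, seed=0):
    """Return one score per fold; each row is held out exactly once."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    scores = []
    for i, holdout in enumerate(folds):
        # Fit on every fold except the one being held out.
        fitting = [row for j, fold in enumerate(folds) if j != i for row in fold]
        model = fit(fitting)
        scores.append(evaluate(model, holdout))
    return scores


# Toy example: the "model" is just the majority label in the fitting data.
def fit_majority(rows):
    labels = [label for _, label in rows]
    return max(set(labels), key=labels.count)


def accuracy(model, rows):
    return mean(1.0 if label == model else 0.0 for _, label in rows)
```

Calling k_fold_scores(rows, 5, fit_majority, accuracy) on a list of (features, label) pairs returns five accuracies, and their spread hints at how much a single holdout score could have misled you.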


Step 7: Debug, analyze, and tune

If you want to know why your model is giving you rubbish performance, turn to advanced analytics on your holdout (debugging) dataset. That’s how you find inspiration for what to try next. The signal you’ll get here usually tells you to backtrack to engineer different features or prepare new algorithm packages to try running your data through.

Pro tip: tackle hyperparameter tuning in this step. “Hyperparameter” is to “algorithm” what “temperature dial” is to “toaster.” Don’t worry too much about that dial the first time you try toasting bread, but once you’re sure the toaster is definitely for you, do invest some time in fiddling with that dial.
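Fiddling with the dial can be as simple as trying every setting on a short list and keeping the one that scores best on your debugging data. The sketch below is a plain grid search under assumptions: fit and evaluate are hypothetical functions you supply, with fit accepting the hyperparameters as keyword arguments.

```python
from itertools import product


def grid_search(grid, fit, evaluate, fitting_rows, debugging_rows):
    """Try every combination of settings; return (best_settings, best_score)."""
    names = list(grid)
    best_settings, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        settings = dict(zip(names, values))
        model = fit(fitting_rows, **settings)      # fit on the fitting portion
        score = evaluate(model, debugging_rows)    # score on the debugging portion
        if score > best_score:
            best_settings, best_score = settings, score
    return best_settings, best_score
```

Note that the scoring happens on debugging data drawn from the training dataset, never on validation or test data.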


Step 8: Validate your models

While you may do whatever you like with your debugging data, you’re not allowed to poke around in your validation dataset because doing so erodes its trustworthiness in your fight against overfitting. You’re only allowed to look at the performance metric. Think of validation as a safe space to get a feel for how the model’s brutal final exam would go… but with room for redemption in case you need to start over. Only move past the validation step when you’re sure that your candidate model is The One.

Pro tip: many ML/AI cowboys think they’re doing validation when they’re actually doing debugging. This bad practice skyrockets the probability of testing failure. That’s cute for school projects where nothing is on the line, but it’s painful when process ignorance torpedoes your business project. Watch out for inexperienced engineers who don’t understand that doing debugging with validation data amounts to playing Russian Roulette.


  • Validation is the breakthrough in the history of data science that sparked the ML/AI revolution.
  • How not to be an AI idiot. (Validate properly!)

Step 9: Test your model

The moment of truth! Testing is where you find out whether your best is good enough on 100% pristine data. Because neither the engineers nor the model has ever seen these data, there is no way they could have cheated to jury-rig a solution that wins without generalizing to the real world. A statistical test of performance in these data is the cleanest, most trustworthy signal of quality you can get. The downside is that you can only use test data once. That’s why you use validation data as a dirty signal first.
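As one possible shape for that statistical test, here is a sketch of a one-sided exact binomial test: it asks how surprising your count of correct test predictions would be if the system’s true accuracy were only at the minimum acceptable level from Step 1. The choice of test, accuracy as the metric, and the 0.05 significance level are assumptions for illustration; your own project may call for something different.

```python
from math import comb


def binomial_p_value(correct: int, total: int, min_accuracy: float) -> float:
    """P(at least `correct` right out of `total`) if true accuracy were min_accuracy."""
    return sum(
        comb(total, k) * min_accuracy**k * (1 - min_accuracy) ** (total - k)
        for k in range(correct, total + 1)
    )


def passes_test(correct: int, total: int, min_accuracy: float,
                alpha: float = 0.05) -> bool:
    """Pass only if the result would be surprising at the minimum acceptable accuracy."""
    return binomial_p_value(correct, total, min_accuracy) < alpha
```

The direct summation is fine for a small illustration; for very large test sets a statistics package would be the more careful choice.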

If you pass testing, you’ll invest the engineering resources to build a live, production-worthy version of your prototype model. If you fail, it’s pens down.

Pro tip: failing testing means you abort your ML/AI project. No wheedling. No whining. No begging. All that fuss about proper debugging and validation was there to give you your shot at redemption, so shut up and accept the test results. The only exception to this rule is the obscenely privileged situation where it’s dirt cheap to collect more data. That allows you to continue your project with a brand new, unpolluted test dataset. The model that failed testing must die, however. Have the strength of character to kill it.


Step 10: Productionize your system

In this step, you’ll turn your prototype into an ML/AI system that has the ability to go live and play nicely with your production code. This might be as simple as writing the recipe on a napkin and using it to help you decide or as complicated as developing a scalable model with automated retraining capabilities that plays nicely with a behemoth codebase, has built-in safety nets, and is designed to withstand adversarial attacks. None of these latter joys were covered in the prototype training phase, so there’s plenty of work ahead.

Pro tip: your model probably won’t exist in isolation, so look for systems and processes with the potential to be affected by yours in surprising ways. Think carefully about their reliability and relevance. (In other words, if you’re about to build a very reliable bull in the middle of a china shop, consider some change management for the china shop.)


Step 11: Run live experiments to launch safely

Once you’ve made your model capable of running live, don’t let it out of the gate all at once. Ramp up gradually and do experiments to verify that turning it loose upon the world is a good idea. If the experiment tells you to leave it locked up, that’s what you should do. (We’ve all seen that horror movie.)

Nervous that all your work is about to go to waste? You should be. You’ve sunk in enough effort to be in love with your project by now and Steps 9 & 11 exist to crush your dreams. Good. Now you’ll be more careful in the preceding steps.

We wouldn’t want your unfettered parental feelings towards your ML/AI system to foist some dearly beloved poisonous rubbish on us. These hurdles are there to ensure that high quality standards are maintained.

Pro tip: you may need to first build infrastructure that allows you to run live statistical experiments, otherwise you won’t be able to launch safely. Part of this is writing code that allows you to randomize which cases are served by your ML/AI system and which are served by your next best alternative (which might be manual).
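The assignment code can be small. The sketch below uses a hash of the case ID as a stand-in for a random draw, which keeps each case in the same arm every time it shows up and lets you ramp up by raising a single number. The function name, the experiment label, and the hashing approach are all assumptions for illustration, not a description of any particular experiment platform.

```python
import hashlib


def served_by_model(case_id: str, treatment_fraction: float,
                    experiment: str = "launch-v1") -> bool:
    """Pseudo-randomly assign a case to the ML/AI system (True) or the fallback (False)."""
    # Salting with the experiment name gives each experiment its own assignment.
    key = f"{experiment}:{case_id}".encode("utf-8")
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % 10_000
    return bucket < treatment_fraction * 10_000
```

Starting with treatment_fraction=0.01 and raising it only when the experiment says so is one way to avoid letting the system out of the gate all at once.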


Step 12: Monitor and maintain… forever

After you’ve launched, you can’t just step away and leave your system to its own devices. You’ll need to keep putting in effort to keep it safe and reliable as time moves forward and the universe changes. It’s the gift that keeps giving (more work).

A good start is having analytics for system monitoring plus a maintenance plan, which includes an even better standard of documentation and the headcount to keep this thing reliable over its lifespan.

Pro tip: if you build a massive production ML/AI system, don’t make the rookie mistake of failing to hire analysts to monitor for input nonstationarity and other surprises.
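To give a flavor of what such monitoring might look like, here is a deliberately crude sketch that flags a numeric input feature whose recent average has wandered far from its baseline. The check itself, the threshold of 3 standard errors, and the function name are assumptions for illustration; it is no substitute for an analyst looking at the data.

```python
from statistics import mean, stdev


def drift_alert(baseline, recent, max_shift=3.0):
    """Flag a feature whose recent mean sits far from its baseline mean.

    The shift is measured in standard errors of the recent mean, using the
    baseline standard deviation (which must be nonzero).
    """
    standard_error = stdev(baseline) / len(recent) ** 0.5
    shift = abs(mean(recent) - mean(baseline)) / standard_error
    return shift > max_shift
```

A check like this only catches a shift in the average, so treat an alert as a prompt to go look, and silence as no guarantee.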


The infographic version for those who love ‘em. Open in a new tab to zoom.

There’s a lot more to machine learning and AI than a bunch of algorithms.


I hope you can see that there’s a lot more to machine learning than a bunch of mathematical algorithms, so don’t be fooled by courses that only teach the algorithm stuff. The art of applying AI to solve business problems boils down to:

Steps 0–1: Asking the right questions

Steps 2–4: Getting and preparing useful data

Steps 5–7: Finding patterns in disposable data

Steps 8–9: Checking that the patterns work on new data

Step 10: Building a production-ready system

Step 11: Making sure that launching is a good idea

Step 12: Keeping a production ML system reliable over time

Anxious to dive deeper into these topics? I’ve got your back! This list also happens to be a table of contents for the deep dive blog topics I’ll be prioritizing for my writing in 2020 (otherwise known as Hindsight Year). The more you share my blog posts with your friends, the more time I devote to writing new ones to bring you new chapters sooner. Stay tuned!

Thanks for reading! How about an AI course?

If you had fun here and you’re looking for an applied AI course designed to be fun for beginners and experts alike, here’s one I made for your amusement:

Enjoy the entire course playlist here:

Liked the author? Connect with Cassie Kozyrkov

Let’s be friends! You can find me on Twitter, YouTube, Substack, and LinkedIn. Interested in having me speak at your event? Use this form to get in touch.



Cassie Kozyrkov
Chief Decision Scientist, Google. ❤️ Stats, ML/AI, data, puns, art, theatre, decision science. All views are my own.