Is your AI project a nonstarter?

Here’s a reality check(list) to help you avoid the pain of learning the hard way

Cassie Kozyrkov
Oct 26, 2018 · 4 min read

If you’re about to dive into a machine learning or AI project, here’s a checklist to cover before you get to the algorithms, data, and engineering. Think of it as your friendly consultant-in-a-box.

Don’t waste your time on AI for AI’s sake. Be motivated by what it will do for you, not by how sci-fi it sounds.

This is a super-short version of my 18-minute monster, the Ultimate Guide to Starting AI. If you’re about to embark on ML/AI, here’s hoping you can answer “yes” to all of these questions.

If you answer “no” to any of the checklist questions, this might be a portrait of your project.

Step 1 of ML/AI in 22 parts: Outputs, objectives, and feasibility

  1. Correct delegation: Does the person running your project and completing this checklist really understand your business? Delegate decision-making to the business-savvy person, not the garden-variety algorithms nerd.
  2. Output-focused ideation: Can you explain what your system’s outputs will be and why they’re worth having? Focus first on what you’re making, not how you’re making it; don’t confuse the end with the means.
  3. Source of inspiration: Have you at least considered data-mining as an approach for getting inspired about potential use cases? Though not mandatory, it can help you find a good direction.
  4. Appropriate task for ML/AI: Are you automating many decisions/labels, in a setting where you can’t just look the answer up perfectly each time? Answering “no” is a fairly loud sign that ML/AI is not for you.
  5. UX perspective: Can you articulate who your intended users are? How will they use your outputs? You’ll suffer from shoddy design if you’re not thinking about your users early.
  6. Ethical development: Have you thought about all the humans your creation might impact? This is especially important for all technologies with the potential to scale rapidly.
  7. Reasonable expectations: Do you understand that your system might be excellent, but it will not be flawless? Can you live with the occasional mistake? Have you thought about what this means from an ethics standpoint?
  8. Possible in production: Regardless of where those decisions/labels come from, will you be able to serve them in production? Can you muster the engineering resources to do it at the scale you’re anticipating?
  9. Data to learn from: Do potentially useful inputs exist? Can you gain access to them? (It’s okay if the data don’t exist yet as long as you have a plan to get them soon.)
  10. Enough examples: Have you asked a statistician or machine learning engineer whether the amount of data you have is enough to learn from? Enough isn’t measured in bytes, so grab a coffee with someone whose intuition is well-trained and run it by them.
  11. Computers: Do you have access to enough processing power to handle your dataset size? (Cloud technologies make this an automatic yes for anyone who’s open to using them.)
  12. Team: Are you confident you can assemble a team with the necessary skills?
  13. Ground truth: Unless you’re after unsupervised learning, do you have access to outputs? If not, can you pay humans to make them for you by performing the task over and over?
  14. Logging sanity: It’s possible to tell which input goes with which output, right?
  15. Logging quality: Do you trust that the dataset actually is what its purveyors claim it is? (To learn from examples, you need good examples to learn from.)
  16. Indifference curves: Since your system will make mistakes, have you considered how much worse one type of mistake is relative to another?
  17. Simulation: Have you considered working with an expert in simulation to help you visualize what you’re asking for? Not mandatory, but useful.
  18. Metric creation: Have you stitched the scoring of individual outputs into a metric for the business performance of your system over many instances?
  19. Metric review: Has your business performance metric been reviewed to ensure that it’s not possible to get a good score on it in some perverse and harmful way?
  20. Metric-loss comparison: (Optional.) Does your business performance metric correlate well with a standard loss function? If not, what you’re asking for might be very difficult.
  21. Population: Have you thought carefully about which instances you need your system to work for? The statistical population of interest defines which broad collection of instances your system’s performance tests will cover.
  22. Minimum performance: Have you defined a strict minimum performance criterion for testing and committed to crushing your system if it doesn’t make this bar?
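
To make items 16, 18, and 22 a little more concrete, here is a minimal sketch of how they might fit together: decide how bad each type of mistake is, stitch the per-output scores into one business metric, and commit to a pass/fail bar before testing. Everything in it is a hypothetical illustration rather than a recommendation: the label names, the cost values in `COST`, the threshold `MAX_ACCEPTABLE_COST`, and the function names are all made up for the example.

```python
# Hypothetical costs (item 16): how much worse is one mistake than another?
# Keys are (true label, predicted label); values are placeholder business costs.
COST = {
    ("positive", "negative"): 5.0,  # a miss: assumed five times worse...
    ("negative", "positive"): 1.0,  # ...than a false alarm
}

# Hypothetical bar (item 22): fixed before any test results are seen.
MAX_ACCEPTABLE_COST = 0.5


def score_output(truth: str, prediction: str) -> float:
    """Score a single output: 0 if correct, otherwise the cost of that mistake."""
    if truth == prediction:
        return 0.0
    return COST[(truth, prediction)]


def business_metric(truths: list[str], predictions: list[str]) -> float:
    """Item 18: combine individual scores into an average cost per instance."""
    if len(truths) != len(predictions):
        raise ValueError("every input needs exactly one matching output")
    if not truths:
        raise ValueError("need at least one instance to score")
    total = sum(score_output(t, p) for t, p in zip(truths, predictions))
    return total / len(truths)


def passes_minimum_bar(truths: list[str], predictions: list[str]) -> bool:
    """Item 22: the system passes only if its average cost is within the bar."""
    return business_metric(truths, predictions) <= MAX_ACCEPTABLE_COST


# Toy test set: one false alarm and one miss out of five instances.
truths = ["positive", "negative", "negative", "positive", "negative"]
predictions = ["positive", "negative", "positive", "negative", "negative"]

print("average cost per instance:", business_metric(truths, predictions))
print("passes minimum bar:", passes_minimum_bar(truths, predictions))
```

In this toy run the system fails the bar, and under the commitment in item 22 that means it doesn’t launch. Note that the same metric is also the thing to stress-test for item 19: if some lazy strategy, such as always predicting one label, can score well on it, the costs or the test population need another look.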

Once you’ve answered “yes” to all that, you’re ready to move to the next step of ML/AI, which involves data and hardware (and engineers, yay!). I’ll be putting out a guide on that soon.

If that was too short of a short summary, the full guide to starting an AI project is here. Enjoy!


Elijah McClain, George Floyd, Eric Garner, Breonna Taylor, Ahmaud Arbery, Michael Brown, Oscar Grant, Atatiana Jefferson, Tamir Rice, Bettie Jones, Botham Jean

Written by Cassie Kozyrkov

Head of Decision Intelligence, Google. ❤️ Stats, ML/AI, data, puns, art, theatre, decision science. All views are my own.
