Before You Can Have a Machine Learning Strategy, You Need a Data Strategy

As we write the book Machine Learning in Practice: How Business Leaders Use Data to Reduce Costs, Increase Efficiency, and Achieve Breakthroughs (coming early in 2019), we’ll be posting draft excerpts right here.

Let us know what you think, give us a clap down below if you like what you read, and follow @InfiniaML and @RobbieAllen on Twitter for the latest updates!

CEOs across the globe are being asked to pull together a machine learning or artificial intelligence strategy for their companies. How is ML being used to automate processes or deliver new capabilities? What is the competition doing with ML and what algorithms are relevant?

The attention around AI/ML is mostly driven by the intense hype that you can hardly avoid in the tech press and mainstream media. It doesn’t take much time reading articles on TechCrunch or even the NYTimes to start worrying that you are falling behind while the rest of the world moves at lightning speed with all sorts of interesting automation use cases.

Unfortunately, you can’t really develop a strategy around machine learning until you have a data strategy.

It may not seem sexy, and if you’re reading this article, you are probably much more interested in fancy machine learning techniques or how to transform your business with deep learning. But there simply is no machine learning without data, and there is no machine learning success without good data. Garbage data going into a machine learning model produces garbage predictions coming out.

So what counts as good data? And how much data is enough to be useful for machine learning? There are no easily quantified guidelines, but in a coming post I’ll offer a framework for developing a data strategy so you are prepared to start evaluating machine learning.