A bit more than a year ago I wrote a post titled “How APIs are Eating the Product Stack”, which illustrates how developer tools and APIs are emerging at every layer of the product stack. This new post is an addition in which I specifically cover how the Machine Learning (ML in the rest of the post) component fits into this trend.
First of all, like many people, I don’t think that every company will become an “ML first” company. I’m aligned with this analogy taken from Ryan’s excellent post:
“In the end, as this technology gets more approachable and robust, the number of companies using machine learning in some aspect of their business will far exceed the number of ‘machine learning companies’. Developing and releasing a mobile application didn’t make you a ‘mobile’ company, it was either a better way to solve an existing problem (hailing a ride) or enabling a new set of product experiences. If it isn’t either of those, users and customers won’t care for long.”
This is also why I think the current development of ML tools fits the broader trend of specialized dev tools “eating” the product stack.
Let’s explore it in more detail.
The ML Sandwich
To understand how ML will get democratized through developer tools, I urge you to first read this excellent post, The evolution of machine learning, by Catherine Dong, where she explains what the ML sandwich looks like:
The ML stack relies on three tightly coupled layers:
- The data layer: proprietary, shared or public data that you will use to feed your ML model.
- The model layer: the ML algorithm that builds predictions based on your input.
- The deployment & monitoring layer: the integration of the results in your product.
Each layer has its own set of challenges, which are covered in Catherine’s post. This is why ML projects can range from very simple implementations with small datasets and standard algorithms to state-of-the-art projects using neural network engines with massive datasets.
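To make the three layers concrete, here is a minimal sketch of the "very simple implementation" end of that range. It is only an illustration: the dataset is made up, and the names `load_data`, `LinearModel`, `train` and `PredictionService` are hypothetical, not taken from Catherine’s post or from any library.

```python
from dataclasses import dataclass
from typing import List, Tuple


# Data layer: a tiny, made-up dataset of (input, target) pairs.
def load_data() -> List[Tuple[float, float]]:
    return [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]


# Model layer: a one-feature linear model fitted by ordinary least squares.
@dataclass
class LinearModel:
    slope: float
    intercept: float

    def predict(self, x: float) -> float:
        return self.slope * x + self.intercept


def train(data: List[Tuple[float, float]]) -> LinearModel:
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in data)
    var = sum((x - mean_x) ** 2 for x, _ in data)
    slope = cov / var
    return LinearModel(slope=slope, intercept=mean_y - slope * mean_x)


# Deployment & monitoring layer: expose the model to the product
# and keep a basic usage counter.
class PredictionService:
    def __init__(self, model: LinearModel) -> None:
        self.model = model
        self.request_count = 0

    def handle(self, x: float) -> float:
        self.request_count += 1
        return self.model.predict(x)


service = PredictionService(train(load_data()))
result = service.handle(5.0)
print(result, service.request_count)
```

A state-of-the-art project has the same three parts; each one just becomes far larger and harder to build and maintain.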
An ML first business needs to focus the entire company around that stack by:
- Collecting data that is, ideally, unique, of the highest possible quality, and impossible for competitors to replicate.
- Creating a highly customized and efficient ML model.
- Using the results to offer a solution 10X better or 10X cheaper than the existing ones.
The vast majority of tech companies won’t have the need, nor the luxury, to focus their entire engineering effort around this ML sandwich. These “non-ML first” companies will use third-party developer tools and APIs to implement ML functionality that enhances specific aspects of their current product.
From that perspective, we’re seeing the emergence of two breeds of developer tools and APIs:
- Vertically integrated ML tools.
- Horizontal ML tools.
Vertically integrated tools
Vertically integrated tools tackle the whole ML sandwich (from data to deployment) and offer it “as a service” to their users. The user sends their data to the tool, which processes it through the whole stack: preparing the data, generating predictions with its internal ML model, and offering various ways to consume the results (API, integrations with third-party apps).
So far I have seen two approaches when it comes to vertically integrated ML tools:
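From the user’s side, this means the whole stack sits behind a single call. The sketch below is a toy stand-in written for illustration only: `HostedTextClassifier` and its keyword lists are hypothetical and do not reflect how any real vendor’s API or model works.

```python
class HostedTextClassifier:
    """Hypothetical stand-in for a vertically integrated text service."""

    # Toy "model": real services would use a trained ML model instead.
    _POSITIVE = {"great", "love", "excellent"}
    _NEGATIVE = {"bad", "hate", "broken"}

    def _prepare(self, text: str) -> list:
        # Data preparation, hidden from the user.
        return [word.strip(".,!?").lower() for word in text.split()]

    def _predict(self, tokens: list) -> str:
        # Prediction, hidden from the user.
        score = sum(t in self._POSITIVE for t in tokens)
        score -= sum(t in self._NEGATIVE for t in tokens)
        if score > 0:
            return "positive"
        if score < 0:
            return "negative"
        return "neutral"

    def classify(self, text: str) -> dict:
        # The only entry point the user sees: raw data in, result out.
        return {"text": text, "label": self._predict(self._prepare(text))}


client = HostedTextClassifier()
print(client.classify("I love this product, it is great!"))
print(client.classify("The app is broken."))
```

The user never touches `_prepare` or `_predict`, which is where both the ease of use and the lack of customization of this kind of tool come from.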
- By type of data: the tools are specialized in a specific type of data like Clarifai for images, Monkeylearn for text, Twentybn for videos, etc.
- By use case or industry: customer support for call centers (CallDesk, a Point Nine company), Qloo for the entertainment and culture industries, etc.
The first obvious benefits for users are the ease of use and the speed of implementation. They don’t need to build everything from scratch (data, ML model, and deployment) and can benefit from great results without having full-time employees maintaining an internal tool.
Another major benefit is the access to pooled data. The size and quality of the dataset used to train a machine learning engine are critical for the performance of the tool. Most users can only use their own “limited” dataset as input, whereas services like Clarifai can train their ML model on much bigger datasets and improve it with each additional user: each new user brings a small additional amount of data that grows the existing dataset. So a user with a small dataset can still benefit from great performance.
On the downside, the first issue can be the lack of customization offered by these tools. If you have specific needs, it can be challenging to customize anything from data processing to the ML model or the integration you need to deploy the results.
A second potential drawback is a limitation that every “laser focus” tool faces: you cannot use it for other use cases or on different types of data. Hence, if you need an ML tool that works across a variety of datasets, an all-around approach might be better.
Horizontal ML tools
Horizontal tools don’t tackle the whole stack but focus on one layer of the ML sandwich.
- Data layer: Dataiku to clean data, Scaleapi to label data, Pandascore to access data (you’ll find many data providers in plenty of industries).
- ML model layer: Tensorflow.
The major benefit of these tools is customization. Developers can create their ML stack by choosing the tools they want and customizing them according to their needs and preferences.
The second benefit is the growing ecosystem of third-party apps around the most important ML platforms such as Tensorflow or Amazon AI. As these platforms attract more users, we’ll see more third-party tools built on top of them, creating a real ecosystem of plugins and add-ons. We’re not there yet, but it will eventually come, and it will be one of the huge advantages of using such platforms.
A big question is whether some of the vertical tools mentioned above have the potential to become “platforms” as well.
The main problem with this approach is, obviously, the amount of resources it takes to build your own ML stack (talent, knowledge or tool costs). The second limitation comes from the “data” layer. You can create the best ML engine possible, but if you don’t have access to a big enough dataset, your results won’t be as good as what some vertically integrated tools can provide.
Build versus Buy
The “build versus buy” question is currently a very tricky one when it comes to internal ML projects for two reasons:
- The ecosystem of ML tools is still very young / not mature enough.
- The community is not “bored” yet.
A young ecosystem.
I still remember that when I first entered the startup world, back in 2007, many companies had their own physical servers installed somewhere in their office. The Cloud infrastructure ecosystem was already emerging (AWS existed), but there were far fewer tools, and they were not as easy to use or as powerful as they are today. As a consequence, using third-party tools to host your website was not a no-brainer yet.
I think we’re in the same situation when it comes to ML: there are not enough “established” winners, and many of the products available need to get stronger and more user-friendly. Until we reach this critical mass of “easy to use” developer tools, choosing to buy rather than to build will be tricky.
The community is not “bored” yet.
When I discuss this topic with some CTOs from our portfolio, most of them explain that if you ask their developers, plenty of them would be excited to build an ML project internally. For many developers, tinkering with ML is something new and exciting, so they’ll naturally lean toward the “build” choice. Once this “excitement” phase is over and most developers have gone through or witnessed failed internal ML projects, the choice of “buy” will become more acceptable.