ML-Infrastructure: Build vs. Buy vs. Open-Source

Notes from TWIMLCon’s Unconference session

Manasi Vartak
Verta
3 min read · Oct 25, 2019


Note: This post is an abbreviated version of the long-form post that appeared here.

Most teams that find themselves in need of ML infrastructure start by looking at reference implementations of ML platforms from large tech companies like Uber (Michelangelo) and Airbnb (BigHead). However, since none of these implementations is open-source and one size does not fit all, every team must decide whether to build its infrastructure, buy it, or piece it together from open-source components.

At the inaugural TWIMLCon in San Francisco, I led an unconference session focused on this very topic: whether to build ML infrastructure in-house, whether to buy it, or whether to just leverage open-source. Here is a summary of the notes from this session.

Why (not) use Open-Source ML Infrastructure?

  • The general consensus among attendees was that current open-source offerings for ML infrastructure, though progressing rapidly, are not mature enough for most teams (unlike, say, MySQL for databases or Airflow for data pipelines). They either require an immense amount of setup and hacking or fall short on functionality.

When to Build ML Infrastructure In-House?

  • Building in-house makes sense if your application has a very specialized use case (e.g., prediction latency must be under 20 ms, or your models must integrate with a legacy model-serving system) or your setup requires heavy customization.

Why Buy ML Infrastructure?

  • The strongest argument for buying an ML platform rather than building in-house is opportunity cost. If building infrastructure will not differentiate your business (i.e., it is a low-leverage activity), don't build it in-house.

How do I run a process to Buy ML Infrastructure?

  • First, know what you are looking for. This was the point attendees highlighted most. For instance, have answers to questions such as: which cloud vendor do you use? Which languages do you need to support?
  • Once you know your requirements, follow a principled and tight process including demos, POCs, and technical deep dives. A full list of questions and a process checklist appear in the long-form article.
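To make the "know your requirements first" advice concrete, the questions above can be captured as a simple checklist before any vendor conversations start. This is a minimal illustrative sketch; the specific categories and questions are assumptions drawn from the session notes, not an official or exhaustive list.

```python
# Hypothetical requirements checklist for evaluating ML platform vendors.
# Categories and wording are illustrative, not prescriptive.
REQUIREMENTS = {
    "cloud": "Which cloud vendor(s) do you use (AWS, GCP, Azure, on-prem)?",
    "languages": "Which languages/frameworks must be supported (Python, R, Spark)?",
    "latency": "What prediction-latency budget do your applications have?",
    "integration": "Which existing systems (CI/CD, data pipelines) must it plug into?",
}

def unanswered(answers: dict) -> list:
    """Return the requirement keys that still lack an answer."""
    return [key for key in REQUIREMENTS if not answers.get(key)]

# Example: a team that has settled its cloud and language requirements
# but not its latency or integration constraints.
answers = {"cloud": "AWS", "languages": "Python"}
print(unanswered(answers))  # the gaps to resolve before vendor demos
```

Running a gap check like this before scheduling demos keeps the process tight: every open item becomes a question for the POC rather than a surprise after purchase.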

Advice for Vendors of ML Infrastructure

  • The biggest, and rather surprising, piece of advice was: push your customers to articulate their needs first. Often, the customer is really looking to better understand the landscape of available tools and see what might match their needs.
  • Vendors are over-promising and under-delivering. If you can do the opposite, you will stand out!

Thank you to TWIMLCon for hosting this unconference session and to everyone who attended. Check out the long-form blog post, and if I missed anything or you have questions, please reach out at manasi@verta.ai.

About Manasi:

Manasi Vartak is the founder and CEO of Verta.ai, an MIT-spinoff building software to enable production machine learning. Verta grew out of Manasi’s Ph.D. work at MIT CSAIL on ModelDB. Manasi previously worked on deep learning as part of the feed-ranking team at Twitter and dynamic ad-targeting at Google.

About Verta:

Verta.ai builds software for the full ML model lifecycle including model versioning, model deployment and monitoring, all tied together with collaboration capabilities so AI & ML teams can move fast.
