Our Sibyl API can serve AI predictions with a 50 millisecond response time. A developer calling the Sibyl API only has to specify an object ID (e.g., give me the prediction for person 0989k3dd84) instead of specifying all of the model’s input parameters. Sibyl supports many predictors with many versions each, and it offers data scientists a robust workflow for model deployment.
Creating and productionizing a useful AI system involves at least three stages:
- Prepare the data that the AI will learn from. This is more complicated than merely pointing an AI system at your database and will likely involve a Data Scientist carefully processing the data to maximize the AI’s ability to learn from it. This is a very interesting and subtle process, but its details are out of scope for this article.
- Build, train, and evaluate the model. This will likely involve a Data Scientist testing different algorithms and different parameters to those algorithms, and will in fact likely be an iterative process interacting with step 1. This is, again, a fascinating topic, but it’s the next step that Sibyl focuses on and which we’ll describe in this article.
- Deploy the model. Ultimately, a Data Scientist has produced a useful AI model with strong predictive power; now you need to make that model available for use at scale within your production systems. This is what Sibyl is for and what we’ll discuss for the rest of this article.
What is AI?
The Oxford English dictionary defines Artificial Intelligence (AI) as “the theory and development of computer systems able to perform tasks that normally require human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages”.
Artificial Intelligence is an extremely broad topic encompassing many potential approaches and applications. The one we’re interested in for the purposes of this article is what’s known as Machine Learning, which uses mathematical techniques to analyze a dataset to “learn” from it and construct a model that can be used to make predictions based on other, similar data, without being specifically programmed to do so.
Sibyl Design Goals and Successes
When discussing Machine Learning, the term “model” is used in a few distinct and context-dependent senses, so to avoid confusion we’ll informally define a few terms we’ll use below:
- A “model” is the actual Python object that was developed and trained by a Data Scientist and exposes a “predict” method.
- A “predictor instance” is the model plus related metadata and artifacts, including the preprocessor function, version/provenance information, and the data on which it was trained.
- A “predictor” is defined by a particular topic and may be implemented by one or more predictor instances.
So, for example, you may have an “LTV” predictor implemented by two different “predictor instances” which accept different inputs, utilize different algorithms, and were trained on different data.
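To make the terminology concrete, here is a minimal sketch of how these three concepts relate. The class and field names are our own illustrative assumptions, not Sibyl’s actual internal types:

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List

# Illustrative sketch only -- these names and fields are assumptions,
# not Sibyl's real data model.

@dataclass
class PredictorInstance:
    """A trained model plus its metadata and artifacts."""
    model: Any                            # the Python object exposing .predict()
    preprocessor: Callable[[dict], list]  # turns raw entity data into model inputs
    version: str                          # version/provenance information
    training_data_ref: str                # pointer to the data it was trained on

@dataclass
class Predictor:
    """A topic (e.g. "LTV") implemented by one or more predictor instances."""
    topic: str
    instances: List[PredictorInstance] = field(default_factory=list)

    def latest(self) -> PredictorInstance:
        # Assume version strings sort lexicographically for this sketch.
        return max(self.instances, key=lambda i: i.version)
```

Under this framing, the two “LTV” predictor instances in the example above would be two entries in one `Predictor`’s `instances` list, each with its own preprocessor, version, and training-data provenance.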
All right, back to Sibyl. When we started designing Sibyl, we had a few high-level goals in mind:
- A Sibyl deployment should support many predictors, and in fact many versions of each predictor.
- Clients should be served by the newest version of the predictor by default, but should optionally be able to request a specific version.
- Deploying a new predictor and/or predictor instance to production should not require updating or redeploying Sibyl itself, only uploading artifacts (the preprocessor function and the model itself) and metadata.
- The model and preprocessor should be deployed in exactly the state used and prepared by the Data Scientist, to avoid duplicating work or introducing bugs.
- In the most common case, requesting predictions about an existing entity within the Ro platform (e.g. a member or a treatment plan), the API should be about as fast as high-performance marketing APIs, around 50 milliseconds (0.05 seconds).
- When requesting a prediction in that common case, the client should be able to identify the object of interest via an identifier (e.g. a UUID used to identify it within the Ro backend) rather than assembling and calculating the model parameters itself.
- The production system should auto-scale to support a wide range of potential workloads without manual intervention.
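Two of these goals, identifier-based requests and optional version pinning, can be seen in the shape of a prediction request. The URL scheme and parameter names below are hypothetical illustrations, not Sibyl’s actual API:

```python
from typing import Optional

# Hypothetical request shape for a Sibyl-like API -- the path layout and
# "version" query parameter are illustrative assumptions.

def prediction_url(base: str, predictor: str, object_id: str,
                   version: Optional[str] = None) -> str:
    """Build a prediction request URL from an object ID.

    The client passes only the entity's identifier (e.g. a UUID); the
    service looks up and preprocesses the model inputs itself.
    """
    url = f"{base}/predictors/{predictor}/predictions/{object_id}"
    if version is not None:
        # By default the newest predictor instance serves the request;
        # a client can pin a specific version explicitly.
        url += f"?version={version}"
    return url

# Default: served by the newest version
prediction_url("https://sibyl.internal", "ltv", "0989k3dd84")
# Pinned to a specific version
prediction_url("https://sibyl.internal", "ltv", "0989k3dd84", version="2.1.0")
```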
Why Build In-House?
We are, of course, hardly the first organization to deploy AI, and there are multiple products available that aim to make it nearly push-button — think offerings like DataRobot or Amazon Sagemaker. So why did we decide to build our own system?
In part, because we can do so more cost-effectively, especially given the high demand and usage we expect for this system. More importantly, because the off-the-shelf systems wouldn’t meet some of the goals we described above.
For instance, an off-the-shelf system might not support versioning models: if you upload a new version, the old one is no longer available. You can work around this via naming (put the version in the model name), but then clients who do want the new version have to update their code to explicitly request the new version and adapt to any changes in the data required.
Speaking of which, an off-the-shelf system will require you to pass the parameters exactly as the model expects them, which may include tens or even hundreds of values. This makes integration significantly more complicated than if the client can simply pass an identifier. You can again work around this, for example by exposing an API that returns the parameters for a given object. However, this adds some latency (your client is now making two requests, one to get the parameters and one to call the model) and at this point running the model yourself isn’t going to add much complexity. Even if your in-house service only calculates the parameters, you still have to handle scaling, versioning (whether via names or a separate concept), and so on.
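The contrast between the two integration patterns can be sketched as follows; the function names are hypothetical stand-ins, not a real SDK:

```python
# Sketch of the two integration patterns described above.

def two_call_pattern(get_params, call_model, object_id):
    """Off-the-shelf workaround: fetch parameters, then call the model.

    Two round trips from the client, so the latencies add up.
    """
    params = get_params(object_id)   # request 1: parameter service
    return call_model(params)        # request 2: hosted model

def one_call_pattern(predict_by_id, object_id):
    """Identifier-based approach: one request carrying only the ID;
    the service assembles the parameters server-side."""
    return predict_by_id(object_id)  # one round trip
```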
In future articles we’ll dig into more technical aspects of Sibyl’s implementation and our overall workflow for productionizing Machine Learning models.
We’ll also talk about how Sibyl fits into a robust Data Science workflow, which is something we’re passionate about — Data Scientists should have supporting automation that makes it easy to maintain a good workflow and automates as many of the bookkeeping, provenance tracking, and deployment tasks as possible.