Deploying Machine Learning Models on Node.js

Ando Saabas, Senior Data Scientist at Bolt

Bolt
Bolt Labs
5 min read · Oct 19, 2018


At Taxify, we connect more than 15 million customers with over 500 thousand drivers, which translates to millions of rides every week. Doing this with an engineering team of ~70 people means that we need to keep our technology stack as lean and streamlined as possible.

Our backend technology stack is based on microservices and built on top of Node.js. A great benefit of this unified approach is that almost all of our excellent tooling and infrastructure — for testing, deployment, and monitoring — can be seamlessly shared and reused between different teams and services.

At the same time, we utilize machine learning throughout our products, for example for drivers’ arrival time estimation, dispatching, pricing, and pickup/dropoff suggestions. Most data science and ML tools are built on Python or R stacks. We are no different — in our team, data analysis and model building is done mainly using Python and Jupyter notebooks. However, if we were to deploy mission-critical models in a Python environment, much of the tooling that already exists for us in Node.js would have to be recreated and maintained for the new services.

To avoid the overhead of supporting multiple sets of tooling, we have taken the approach of deploying machine learning models as Javascript libraries in Node.js. So while all the actual data science and model training is done in Python, the final models are automatically translated to Javascript and deployed as standard Node.js libraries.

In the next sections, we’ll go through how we have achieved the automatic model translation and deployment.

Translating ML models to Javascript

There is a wide range of ML models that one can utilize depending on the problem at hand. For some types of models, translation to Javascript is straightforward. For example, linear and logistic regression models can be trivially translated into Javascript functions. For neural network models, TensorFlow has Javascript and Node.js support.
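As an illustration of how simple this translation can be, here is a sketch of what a translated linear and logistic regression model might look like. The coefficients and intercepts below are made-up values, not from any real model:

```javascript
// Hypothetical linear regression model translated into a plain Javascript
// function: a weighted sum of the input features plus an intercept.
function linearModel(data) {
  return 0.5 * data[0] + 1.2 * data[1] - 0.3;
}

// A logistic regression model adds a sigmoid on top of the same
// linear combination, so the output is a probability in (0, 1).
function logisticModel(data) {
  const z = 0.5 * data[0] + 1.2 * data[1] - 0.3;
  return 1 / (1 + Math.exp(-z));
}
```

Once a model is expressed this way, it is just an ordinary function with no runtime dependencies, which is what makes it trivial to package as a Node.js library.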

Among the most successful and widely used machine learning libraries are XGBoost and LightGBM (as witnessed by the number of Kaggle competition wins achieved using them) which implement gradient boosted models. We use gradient boosted models extensively at Taxify; unfortunately, these libraries do not have built-in support for Javascript.

Translating Gradient Boosted Trees

At their core, gradient boosted (tree) models are sets of decision trees, applied sequentially to input data. A decision tree, in turn, can be thought of as a nested if-then-else statement. This means that the internal data structure underlying a learned gradient boosted decision tree model can be translated into code as if-then-else statements.

In the following table, we have an example of a data structure specifying a decision tree in LightGBM: its split features, the corresponding thresholds, the indices of each node’s children, and finally the leaf values.

A straightforward translation of the structure into valid Javascript would be the following, assuming that “data” is an array of input feature values:
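A sketch of what such a translated tree might look like is shown below. The feature indices, thresholds, and leaf values are hypothetical stand-ins for the kind of flattened structure LightGBM stores; a real model would have far more nodes:

```javascript
// Hypothetical two-level decision tree translated into nested
// if-then-else statements. "data" is an array of input feature values;
// each internal node compares one feature against a learned threshold,
// and each leaf returns a learned value.
function tree(data) {
  if (data[3] <= 1.45) {
    if (data[0] <= 0.72) {
      return 0.081;  // leaf value
    } else {
      return -0.034; // leaf value
    }
  } else {
    return 0.127;    // leaf value
  }
}
```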

Thus, as the final outcome of the translation, we have a Javascript function that takes as input a feature vector, passes it through the set of if-then-else statements, and returns the final result as the model output.
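Putting the pieces together, the full translated model can be sketched as a sum over the individual tree functions. The two one-split trees below are trivial stand-ins for the generated trees described above:

```javascript
// Stand-in tree functions; a real gradient boosted model would have
// hundreds or thousands of these, each generated from the model dump.
const trees = [
  (data) => (data[0] <= 0.5 ? 0.1 : -0.2),
  (data) => (data[1] <= 1.0 ? 0.05 : 0.15),
];

// The model prediction is the sum of all tree outputs. For classification
// models, a final transform (e.g. a sigmoid) would be applied to the score.
function predict(data) {
  let score = 0;
  for (const t of trees) {
    score += t(data);
  }
  return score;
}
```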

To help with this type of model-to-code translation, we use the excellent Treelite library. While it does not support Javascript directly (it generates C and Java code), we use it as the backend for our Javascript translator.

Automation and testing

Translating models into Javascript is a fully automatic process. In addition, we automatically create tests to check the validity of the translation. After translating a model, we generate predictions from both the original model and the translated Javascript code on hold-out data. We then check that all predictions from the original and translated models match, to catch any errors that could arise, for example from differing floating point accuracy. If the tests pass, the generated model can be pushed to the model repository using npm.

Performance

The translation can easily handle very large models with thousands of trees. The generated code generally has excellent performance, thanks to the optimizing just-in-time compiler in V8, the Javascript engine underlying Node.js. In our experience, the generated code matches or exceeds the speed of calling the GBT library directly.
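A rough micro-benchmark sketch of how such throughput can be measured is below. The one-split model function is a trivial stand-in for a generated thousand-tree function, and the numbers it produces on any given machine are not comparable to the results we report:

```javascript
// Trivial stand-in for a generated model function.
function model(data) {
  return data[0] <= 0.5 ? 0.1 : -0.2;
}

// Measure predictions per second over a fixed number of iterations,
// using Node's high-resolution timer. The accumulator keeps the call
// from being optimized away entirely.
function throughput(fn, row, iterations = 1e6) {
  const start = process.hrtime.bigint();
  let acc = 0;
  for (let i = 0; i < iterations; i++) {
    acc += fn(row);
  }
  const elapsedNs = Number(process.hrtime.bigint() - start);
  return { predictionsPerSecond: iterations / (elapsedNs / 1e9), acc };
}
```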

In the following chart, we plot the performance for a 1000-tree model, translated from a LightGBM model. The translated model achieves about 60% higher throughput compared to calling the original model via the Python API, reaching about 10,000 predictions per second on a single i7 core.

Summary

Running ML models — and gradient boosted trees in particular — on Node.js is a viable path for deploying machine learning models in production for real-time services. This approach offers several benefits:

  • We can utilize existing deployment and monitoring tools that have been built for the rest of the technology stack.
  • We can use existing tools and team know-how for debugging and profiling the models, since they are simply Javascript libraries.
  • If necessary for performance reasons, the models can live close to the rest of the production code (i.e. used as library imports directly by the consuming service), removing the overhead of network calls.

About the Author

Ando Saabas is a senior data scientist at Taxify. He is involved in most of Taxify’s ML based efforts including the ETA engine, dispatching and pricing optimization, and simulation, as well as helping develop Taxify’s ML and data science infrastructure.

Prior to Taxify, Ando spent 9 years at Skype and Microsoft as a researcher and applied scientist, working mostly on call quality and reliability in Skype and Microsoft Teams.
