How to run TensorFlow.js on a serverless platform: deploying models

Dominique D'Inverno
4 min read · Mar 25, 2020


This is the last part of a three-article series.

In the first part, we introduced neural networks and TensorFlow framework basics.

In the second part, we explained how to convert existing models from Python to TensorFlow.js.

Finally, we present, through an example, how to use an online TensorFlow.js model and deploy it rapidly using our WarpJS JavaScript Serverless Function-as-a-Service (FaaS).

Using an online model with TensorFlow.js

Many pre-trained public models can be retrieved from online repositories.

We’ll use the “toxicity” pre-trained model in the next sections as an example.

The toxicity model detects whether text contains toxic content such as threatening language, insults, obscenities, identity-based hate, or sexually explicit language. The model was trained on the Civil Comments dataset, which contains ~2 million comments labeled for toxicity. The model is built on top of the Universal Sentence Encoder (Cer et al., 2018).

Browser-based usage:

The model can be directly loaded for use in JavaScript at:

In the HTML, add:

<script src=""></script>
<script src=""></script>

Then, in the JS code:

// sets the minimum prediction confidence
const threshold = 0.9
// load and init the model
const model = await toxicity.load(threshold);
. . .
// apply an inference
const predictions = await model.classify(inputText);
. . .
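To make the role of the threshold concrete, here is a minimal sketch of how a pair of class probabilities maps to the `match` flag that `classify` reports for each label. It reflects our understanding of the library's behavior rather than quoting its source; the function name `matchFor` and the sample probabilities are hypothetical.

```javascript
// Hypothetical helper: derive the match flag from [P(not toxic), P(toxic)].
function matchFor(probabilities, threshold) {
  const [notToxic, toxic] = probabilities
  // Neither class is confident enough: the result is undecided.
  if (Math.max(notToxic, toxic) <= threshold) return null
  // Otherwise report whether the "toxic" class is the more likely one.
  return toxic > notToxic
}

console.log(matchFor([0.03, 0.97], 0.9)) // true
console.log(matchFor([0.95, 0.05], 0.9)) // false
console.log(matchFor([0.40, 0.60], 0.9)) // null
```

A higher threshold therefore produces more undecided (`null`) results and fewer confident matches.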

Node-based usage:

Toxicity is also available as an npm module for Node.js (the package actually loads the model from the storage link above):

$ npm install @tensorflow-models/toxicity

Then, in the JS code:

const toxicity = require('@tensorflow-models/toxicity');
// sets the minimum prediction confidence
const threshold = 0.9
// load and init the model
const model = await toxicity.load(threshold);
. . .
// apply an inference
const predictions = await model.classify(inputText);
. . .
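`classify` resolves to one entry per label, each carrying a `results` array with one item per input sentence. The sketch below, which assumes that shape, lists the labels flagged for a given sentence. `flaggedLabels` is a hypothetical helper, and `samplePredictions` is hand-written illustrative data, not real model output.

```javascript
// Hand-written data in the shape returned by model.classify(['...']).
// (The real library returns typed arrays for the probabilities.)
const samplePredictions = [
  { label: 'identity_attack', results: [{ probabilities: [0.98, 0.02], match: false }] },
  { label: 'insult', results: [{ probabilities: [0.04, 0.96], match: true }] },
  { label: 'threat', results: [{ probabilities: [0.55, 0.45], match: null }] },
  { label: 'toxicity', results: [{ probabilities: [0.02, 0.98], match: true }] }
]

// Hypothetical helper: labels whose match flag is true for one sentence.
function flaggedLabels(predictions, sentenceIndex = 0) {
  return predictions
    .filter(({ results }) => results[sentenceIndex].match === true)
    .map(({ label }) => label)
}

console.log(flaggedLabels(samplePredictions)) // [ 'insult', 'toxicity' ]
```

This strict check ignores undecided (`null`) results; a looser `match !== false` test would count them as toxic.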

Deploying a model with WarpJS

As discussed before, inference on large data sets in the browser quickly falls short in terms of performance, due to model and data loading times as well as limited computing capabilities (even with accelerated backends).

Node.js pushes the performance limit further by allowing deployment on a high-performance GPU engine, close to the dataset on the network, but users face added complexity when they need distributed processing to reach the next performance step.

The WarpJS JavaScript FaaS enables easy serverless process distribution with very little development effort.
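As an illustration of what distributing inference means in practice, the sketch below splits a large list of sentences into chunks and classifies the chunks concurrently. It assumes some asynchronous `classify(texts)` function that resolves to a boolean; `chunk`, `classifyInChunks` and `fakeClassify` are hypothetical names, and no WarpJS API is used here.

```javascript
// Split an array into consecutive slices of at most `size` items.
function chunk(items, size) {
  const chunks = []
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size))
  }
  return chunks
}

// Run one classify call per chunk, all in flight at the same time.
// Resolves to one boolean per chunk, in the original order.
function classifyInChunks(texts, classify, size) {
  return Promise.all(chunk(texts, size).map(part => classify(part)))
}

// Stand-in for a remote inference function (assumption, not a real model).
const fakeClassify = async texts => texts.some(t => t.includes('suck'))

classifyInChunks(['hello', 'you suck', 'nice day', 'thanks'], fakeClassify, 2)
  .then(flags => console.log(flags)) // [ true, false ]
```

When each call runs on a separate serverless instance, the chunks are processed in parallel instead of queuing on a single machine.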

Example: toxicity model serverless deployment

WarpJS installation guidelines can be found here: Getting started with WarpJS

You can request a WarpJS account here.

This article also provides a good tutorial covering all the steps needed to operate WarpJS.

In our WarpJS serverless operation, the browser acts as the primary input/output interface, through an index.html file.

It contains a text box to submit the input text to be analyzed and a “classify” button triggering the inference process.

<!DOCTYPE html>
. . .
. . .
<h1>TensorFlow.js toxicity demo with WarpJS</h1>
<form id="form">
<input id="classifyNewTextInput" placeholder="e.g. 'you suck'" required>
<p id="result"></p>
. . .

WarpJS is a function-as-a-service platform for JavaScript.

Instead of creating HTTP endpoints and using HTTP calls for remote inference, we just have to build a client for this inference function, deploy it on the FaaS, import it into the main application (via an import statement in index.js or a script tag in the HTML) and then call it like any other JavaScript function.

So in our case, the classify function will run on the WarpJS FaaS.
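For comparison, the conventional approach that this avoids might look like the following sketch, which uses only Node's built-in `http` module. The `/classify` route, the JSON payload shape, `handleClassify`, `createServer` and the `stubClassify` stand-in are all assumptions made for illustration.

```javascript
const http = require('http')

// Stand-in for the real inference function (assumption, not a real model).
const stubClassify = async texts => texts.some(t => /suck/i.test(t))

// Turn a raw JSON request body into a JSON response body.
async function handleClassify(rawBody, classifyFn) {
  const { texts } = JSON.parse(rawBody)
  const toxic = await classifyFn(texts)
  return JSON.stringify({ toxic })
}

// Build (but do not start) a server exposing POST /classify.
function createServer(classifyFn) {
  return http.createServer((req, res) => {
    if (req.method !== 'POST' || req.url !== '/classify') {
      res.writeHead(404).end()
      return
    }
    let raw = ''
    req.on('data', part => { raw += part })
    req.on('end', async () => {
      try {
        const body = await handleClassify(raw, classifyFn)
        res.writeHead(200, { 'Content-Type': 'application/json' }).end(body)
      } catch {
        // Malformed JSON or a failed inference ends up here.
        res.writeHead(400).end()
      }
    })
  })
}

handleClassify('{"texts":["you suck"]}', stubClassify)
  .then(body => console.log(body)) // {"toxic":true}
```

A browser client would then have to send a POST request with a JSON body and parse the response, which is the plumbing that importing the deployed function replaces.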

index.js (code to be deployed on the WarpJS FaaS; classify is the prediction function):

// Server initialization
const toxicity = require('@tensorflow-models/toxicity')

// The minimum prediction confidence
const threshold = 0.9

// Load the model
let modelLoaded = false
let model = null
toxicity.load(threshold).then(tsModel => {
  model = tsModel
  modelLoaded = true
})

// Force waiting for the async TensorFlow model load.
// The "deasync" lib turns an async function into a sync one via a JS wrapper of the Node event loop.
// The "loopWhile" function blocks until the condition becomes false.
require('deasync').loopWhile(() => !modelLoaded)

// Prediction function
const classify = async inputs => {
  // predict with the TensorFlow model
  const predictions = await model.classify(inputs)
  // check toxicity results: anything not explicitly cleared
  // (match === false) is reported as toxic, including undecided (null) results
  const toxic = predictions.some(({ results }) => results[0].match !== false)
  return toxic
}

module.exports = { classify }
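Blocking the event loop with `deasync` is one way to wait for the model. A common alternative, shown here as a sketch rather than as the approach used in this demo, is to keep the load promise and await it inside the prediction function. `createClassifier`, `fakeLoad` and `classifyWithFake` are hypothetical names; in the real module the loader would be `() => toxicity.load(threshold)`.

```javascript
// Build a classify function around any loader that resolves to a model
// exposing classify(inputs), such as the toxicity model.
function createClassifier(loadModel) {
  // Start loading once; every call shares the same promise.
  const modelPromise = loadModel()
  return async inputs => {
    const model = await modelPromise
    const predictions = await model.classify(inputs)
    return predictions.some(({ results }) => results[0].match !== false)
  }
}

// Stand-in loader returning a fake model (assumption, for illustration only).
let loads = 0
const fakeLoad = async () => {
  loads += 1
  return {
    classify: async inputs => [
      { label: 'toxicity', results: [{ match: inputs[0].includes('suck') }] }
    ]
  }
}

const classifyWithFake = createClassifier(fakeLoad)
classifyWithFake(['you suck']).then(toxic => console.log(toxic)) // true
```

Calls made before the model is ready simply wait on the shared promise, so no blocking loop is needed and the model is loaded only once.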

index.js (main, browser-based application):

/*
 * Copyright 2020 ScaleDynamics SAS. All rights reserved.
 * Licensed under the MIT license.
 */
'use strict'

// import WarpJS engine module
import engine from '@warpjs/engine'
// import deployed inference
import { classify } from 'warp-server'

// on submit form
document.getElementById('form').addEventListener('submit', async event => {
  // keep the browser from reloading the page on submit
  event.preventDefault()
  result.innerHTML = '<h2>Remote inference running</h2>'
  // scan textbox
  const text = classifyNewTextInput.value
  // invoke inference
  const toxic = await classify([text])
  // render result
  if (toxic) {
    result.innerHTML = `<h2 style="color:red">Your sentence is TOXIC :(</h2> <img src="/img/Pdown.png" alt="">`
  } else {
    result.innerHTML = `
      <h2 style="color:green">Your sentence is NON TOXIC :)</h2>
      <img src="/img/Pup.png" alt="">`
  }
})

Deploying to the WarpJS FaaS is straightforward: just run “npm run deploy” to get the URL of the deployed site and start playing with TensorFlow.js.

Feel free to access the URL to see the demo in action.



About the author

Dominique d’Inverno holds an MSc in telecommunications engineering. After 20 years of experience including embedded electronics design, mobile computing systems architecture and mathematical modeling, he joined the ScaleDynamics team in 2018 as an AI and algorithm development engineer.


