Using TensorFlow.js for business predictions while maintaining privacy

iERP.ai Studio platform

How iERP.ai was born

As a project manager who spent 15 years implementing asset management and big data solutions for Fortune 500 companies, and as an active developer who has been playing with code for the past 30 years, I saw the significant impact that data has on a company's success. That sparked an idea: enable small and medium-sized companies to run the same kind of data-driven business, but without long onboarding times and large implementation projects.

How we see data privacy in business predictive analytics

We think about data privacy from a data-sharing perspective. Sharing less data with third parties helps reduce the risk of data breaches that are beyond the customer's control. Furthermore, small and micro-sized businesses prefer ease of use and reduced costs above all else.

Choosing TensorFlow.js

We chose TensorFlow.js as our machine learning library to build our two products, an on-premise app and a cloud offering, with one team of JavaScript developers working on both. This choice also lets us:

  • Have developers working in Python on proofs-of-concept, which can then be easily migrated to Node.js / Electron environments
  • Have access to a very large community of other developers familiar with TensorFlow, whilst having the reach and scale benefits of JavaScript

iERP.ai on-premise app (TensorFlow.js + ElectronJS)

iERP.ai Studio platform

iERP.ai cloud (TensorFlow.js + NodeJS)

We use TensorFlow.js with Node.js, packaged with Docker, as the backbone of our cloud offering. The Node.js version lets training and forecasting run the same way whether they are triggered from the command-line interface or from Docker and executed in the cloud.
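
As a rough illustration of a single entry point shared by the command line and Docker, here is a minimal sketch. The command names (`train`, `forecast`), the `--epochs` flag, and the `parseCommand` / `runCli` helpers are hypothetical and not taken from the iERP.ai code base; the handlers only return a description of what they would do.

```javascript
// Hypothetical CLI shape: node cli.js <train|forecast> [--epochs N]
// All names below are illustrative assumptions, not the iERP.ai API.
function parseCommand(argv) {
  const [command, ...rest] = argv;
  if (command !== 'train' && command !== 'forecast') {
    throw new Error(`Unknown command: ${command}`);
  }
  const options = { epochs: 10 }; // assumed default
  for (let i = 0; i < rest.length; i += 2) {
    if (rest[i] === '--epochs') {
      options.epochs = Number(rest[i + 1]);
    }
  }
  return { command, options };
}

// The same dispatch table is used whether the process is started
// from a shell or as the command of a Docker container.
const handlers = {
  train: (options) => `training for ${options.epochs} epochs`,
  forecast: () => 'forecasting with the saved model',
};

function runCli(argv) {
  const { command, options } = parseCommand(argv);
  return handlers[command](options);
}

console.log(runCli(['train', '--epochs', '5']));
```

In a real entry point the call would likely be `runCli(process.argv.slice(2))`, so that a shell invocation and a container command share exactly the same code path.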

Lessons learned

Over the past two years we learned a number of lessons about making TensorFlow.js work the way we needed in these environments. We would love to share them, along with how we got past the various challenges in that time; they are summarized below.

Heap usage comparison in Node.js
TensorFlow.js running in a worker thread
  • UI thread: Responsible for UI using React
  • TFJS worker thread: Training and forecasting execution via tfjs-node.
// index.js
const { Worker } = require('worker_threads');

// Starts the worker thread and resolves once training has completed.
function runService(workerData) {
  return new Promise((resolve, reject) => {
    const worker = new Worker('./worker-thread.js', { workerData });
    worker.on('message', (data) => {
      switch (data.type) {
        case 'epochUpdate':
          console.log('Epoch:', data.epochNumber, 'Loss:', data.loss);
          break;
        case 'trainingCompleted':
          resolve();
          break;
        default:
          // other behaviour
      }
    });
    worker.on('error', reject);
    worker.on('exit', (code) => {
      if (code !== 0) {
        reject(new Error(`Worker stopped with exit code ${code}`));
      }
    });
  });
}

function generateTrainingData() {
  // Here you will prepare your training data
  return [[1, 2], [2, 3], [3, 4], [5, 6], [7, 8]];
}

function generateTestData() {
  // Here you will prepare your testing data
  return [3, 5, 7, 11, 15];
}

async function run() {
  const trainingData = generateTrainingData();
  const testData = generateTestData();
  await runService({
    type: 'training',
    trainingData,
    testData,
  });
}

run().catch((err) => console.error(err));

// worker-thread.js
const tf = require('@tensorflow/tfjs-node');
const { parentPort, workerData } = require('worker_threads');

if (workerData.type === 'training') {
  // Inputs: one row of 2 values per sample; targets: one value per sample.
  const xs = tf.tensor2d(workerData.trainingData, [workerData.trainingData.length, 2]);
  const ys = tf.tensor1d(workerData.testData);
  const epochs = 10;
  const batchSize = 1;

  const model = tf.sequential();
  model.add(tf.layers.dense({ units: 3, inputShape: [2] }));
  model.add(tf.layers.dense({ units: 1 }));
  model.compile({ optimizer: 'adam', loss: 'meanSquaredError' });

  model.fit(xs, ys, {
    epochs,
    batchSize,
    verbose: 0,
    callbacks: {
      // Report progress to the main thread after every epoch.
      onEpochEnd: async (epochNumber, logs) => {
        parentPort.postMessage({ type: 'epochUpdate', epochNumber, loss: logs.loss });
      },
    },
  }).finally(() => {
    // Sent whether training succeeded or failed, so the main thread can settle.
    parentPort.postMessage({ type: 'trainingCompleted' });
  });
}
Our TensorFlow.js custom data pipeline
  1. For a typical algorithm, we load 5–6 separate tables of data, which are then cross-validated and used for training and forecasting.
  2. For each algorithm, we develop a set of SQL queries that verify data integrity, generate helper fields, and split records into two lists (training and testing data sets). This approach is efficient, enabling complex queries to be executed in tens of milliseconds, with the side benefit that those SQL queries are portable and can be used in the cloud as well if desired.
  3. Because all data is available in SQLite3, we can create business intelligence reports for end users using SQL. That shortened our development time and cut complexity.
  4. When the time comes to execute training, we take an iterative approach: a subset of data is exported from SQLite3 into Node.js arrays, from which tensors are created.
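
The last step of the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: the row shape (`qty`, `price`, `label`) and the helpers `rowChunks` and `toArrays` are hypothetical, and the in-memory `rows` array stands in for the result of a SQLite3 query. No database or TensorFlow.js call is made here, so the sketch runs on plain Node.js.

```javascript
// Hypothetical rows, standing in for the result of a SQLite3 query.
const rows = [
  { qty: 1, price: 2, label: 3 },
  { qty: 2, price: 3, label: 5 },
  { qty: 3, price: 4, label: 7 },
  { qty: 5, price: 6, label: 11 },
  { qty: 7, price: 8, label: 15 },
];

// Yields the rows in consecutive subsets of at most chunkSize rows,
// so only one subset needs to be converted to arrays at a time.
function* rowChunks(allRows, chunkSize) {
  for (let start = 0; start < allRows.length; start += chunkSize) {
    yield allRows.slice(start, start + chunkSize);
  }
}

// Converts a subset of rows into plain nested arrays:
// xs holds one feature array per row, ys holds one target per row.
function toArrays(chunk, featureKeys, labelKey) {
  return {
    xs: chunk.map((row) => featureKeys.map((key) => row[key])),
    ys: chunk.map((row) => row[labelKey]),
  };
}

for (const chunk of rowChunks(rows, 2)) {
  const { xs, ys } = toArrays(chunk, ['qty', 'price'], 'label');
  // With TensorFlow.js loaded, these arrays could become tensors,
  // e.g. tf.tensor2d(xs) and tf.tensor1d(ys), before a training step.
  console.log(`chunk with ${xs.length} rows, targets: ${ys.join(', ')}`);
}
```

Keeping the conversion step free of TensorFlow.js calls makes it easy to unit-test the data preparation separately from training.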
Usage of AI algorithms distributed using npm packages

iERP.ai Algorithms

After two years of development, we have a portfolio of 4 business offerings, which we call “Algorithms”, that are implemented in our Studio product and will soon be available in the cloud too.

iERP.ai Studio algorithm selection
  • Next best offer: Forecasting what product or service each consumer will purchase next. Historical customer behavior and correlated events are used during data processing for training and forecasting. You can read more about our approach here.
  • Discount recommendation: Helping customers to personalize discounts for individual consumers and products. Historical customer behavior data is used in a deep, fully connected network, combined with a large set of rules, to recommend optimal discounts for each customer and product separately. You can read more about our approach here.
  • Debt aging: Helping customers estimate the probability that one of their customers will not pay an invoice. Historical customer behavior data and customer segmentation are used by a custom network that combines an RNN part for historical events with a fully connected part for customer segmentation information.
iERP.ai Studio training

Conclusions:

Each business problem poses a different set of challenges. TensorFlow.js helps us to handle all of these situations. We love the power of accelerated machine learning, the strong JavaScript ecosystem, and code reuse across all platforms.

Acknowledgments:

We would like to thank Google, TensorFlow.js, and the whole community for their hard work on this incredible platform enabling innovation beyond expectations.

Written by

CEO & Co-founder of iERP.ai, leader, entrepreneur, technology & AI enthusiast … (More: https://www.linkedin.com/in/jozefbalaz/)
