How to build a high-level front-end machine learning platform based on tfjs-node

Published in imgcook
8 min read · Jan 7, 2020

Background

With the development of deep learning, intelligentization has begun to empower all walks of life. As the part of the stack closest to users, front-end developers also hope to use AI capabilities to significantly improve efficiency, reduce labor costs, and create a better experience for users. Therefore, front-end intelligentization is regarded as an important direction for the front end in the future.

However, several problems have been hindering the development of intelligentization:

  • Machine learning engineers who are familiar with machine learning lack a sense of front-end business; they do not understand the data accumulated by the front end or the potential value of that data, so it is difficult for them to participate in this process.
  • Traditional front-end developers do not know much about the languages commonly used in machine learning, such as Python and C++, so the cost of learning and switching is very high.
  • Traditional front-end engineers do not understand the algorithms and principles of deep learning, which makes it difficult for them to directly use existing machine learning frameworks (TensorFlow, PyTorch, etc.) for training.

To solve these problems and promote the development of front-end intelligence, we developed pipcook. Pipcook runs in a front-end-friendly JS environment and uses tfjs as its underlying algorithm capability. It wraps the relevant algorithms for front-end business scenarios, allowing front-end engineers to use machine learning capabilities quickly and easily.

This article mainly discusses how pipcook works with Tensorflow.js, focusing on how to use the underlying models and computing power of tfjs-node to build a high-level machine learning pipeline. For pipcook itself, you can refer to this article.

Why use Tensorflow.js as the underlying algorithm framework

Tensorflow.js is a JS-based machine learning framework released by Google in 2018, and Google has open-sourced the relevant code. Pipcook uses tfjs-node as the underlying dependency for data processing and model training, develops pipcook plug-ins on top of Tensorflow.js, and assembles them into a pipeline. We use Tensorflow.js for the following reasons:

  • Pipcook is focused on serving front-end engineers, so it is mainly developed in JS. Therefore, we prefer a JS computing framework to avoid the performance loss and error risks caused by bridging to other languages.
  • Compared with other JS machine learning frameworks, Tensorflow itself is very popular in the C++ and Python communities. The JS version reuses the underlying C++ capabilities and many operators, supports a large number of network layers, activation functions, optimizers, and other components, and provides good performance and GPU support.
  • Official tools such as tfjs-converter are provided to convert SavedModel or Keras models into JS models, allowing you to reuse many mature Python models.
  • JS is not mature in the ecosystem of mathematical computing: there is no scientific computing library like numpy, and similar libraries are difficult to combine seamlessly with other computing frameworks. Tfjs itself provides tensor encapsulation with numpy-array-like capabilities that can be passed directly into tfjs models for training, and the performance is very good (see the short sketch after this list).
  • Tfjs provides the Dataset API, which abstracts data, encapsulates simple and efficient interfaces, and processes data in batches. The data flow model of the Dataset API also combines efficiently with the pipcook pipeline.
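To illustrate the tensor capabilities mentioned in the list above, here is a minimal sketch (assuming the @tensorflow/tfjs-node package is installed) of numpy-style operations done directly in JS:

```js
// A minimal sketch of tfjs tensor operations in Node.js,
// assuming @tensorflow/tfjs-node is installed.
const tf = require('@tensorflow/tfjs-node');

// Create 2-D tensors, similar to numpy arrays.
const a = tf.tensor2d([[1, 2], [3, 4]]);
const b = tf.tensor2d([[5, 6], [7, 8]]);

// Matrix multiplication and element-wise math run on the native
// C++ backend (or GPU) provided by tfjs-node.
const product = a.matMul(b);
const normalized = product.sub(product.mean()).div(product.max());

product.print();
normalized.print();
```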

Use Tensorflow.js to process data

For machine learning, accessing and processing large amounts of data is a key issue. In traditional scenarios where the amount of data is not large, we can read all of the data into memory at once. For deep learning, however, the data is usually far larger than memory, so we need the ability to access data from the data source segment by segment, as needed. The Dataset API provided by tfjs wraps exactly this capability.
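As a hedged sketch of this lazy, segment-by-segment access (the generator below is a hypothetical stand-in for a real data source), a dataset can be backed by a generator so that elements are only produced when the consumer asks for them:

```js
// A minimal sketch of lazy data access with the tfjs Dataset API.
// The generator is a stand-in for reading records from disk or the
// network; elements are produced only when the dataset is pulled.
const tf = require('@tensorflow/tfjs-node');

function* sampleGenerator() {
  for (let i = 0; i < 1000000; i++) {
    // In a real pipeline this would read one record as needed.
    yield { xs: tf.tensor1d([i]), ys: tf.tensor1d([i % 2]) };
  }
}

const dataset = tf.data.generator(sampleGenerator);

// Only the first three elements are ever materialized here.
dataset.take(3).forEachAsync(e => e.xs.print());
```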

Data Process Flow

In a standard pipcook pipeline, we use the Dataset API to package and process the data. The figure above shows a typical data flow:

  • First, raw training data is imported into the pipeline through the data-collect plug-in. The raw data may be local files or data stored on the cloud; the plug-in is responsible for reading it in.
  • After reading the data, the data-collect plug-in determines the data format and encapsulates it into a corresponding tensor.
  • The data-access plug-in is responsible for accessing the data and packaging the tensors into a tf.data.Dataset to facilitate batch processing and training later.
  • In the data-process plug-in, we perform specific processing on the data, including shuffle, augment, and other operations. These operations use operators such as map encapsulated by the Dataset API, and batch processing is performed in real time in the data stream.
  • Then, in the model-load stage, data is read into the model in batches for training.

At this point, we can think of a dataset as an iterable set of training data, much like a Stream in Node.js. Whenever the next element is requested from a dataset, the internal implementation accesses the element as needed and applies the preset data-processing functions. This abstraction allows the model to easily train on large amounts of data. When we have multiple datasets, they can also be easily shared and organized under the same abstraction.
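To make the flow above concrete, the following simplified sketch expresses the same stages directly with the Dataset API; it is not pipcook's actual plug-in code, and the toy samples are purely illustrative:

```js
// A simplified sketch of the stages described above, expressed directly
// with the tfjs Dataset API (not actual pipcook plug-in code).
const tf = require('@tensorflow/tfjs-node');

// data-collect / data-access: wrap raw samples and labels as datasets.
const xs = tf.data.array([[0, 0], [0, 1], [1, 0], [1, 1]]);
const ys = tf.data.array([[0], [1], [1], [0]]);

const dataset = tf.data.zip({ xs, ys })
  // data-process: shuffle; an augmentation step could be added via map().
  .shuffle(4)
  // batch the stream so the model can consume it chunk by chunk.
  .batch(2);

// model-load / model-train would then iterate over the batches.
dataset.forEachAsync(batch => batch.xs.print());
```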

Training model

Model Train Architecture

TensorFlow.js is divided into two groups of APIs: low-level and high-level. The low-level API is derived from deeplearn.js and includes the operators (ops) needed to build models; it is responsible for low-level operations such as linear algebra and helps us handle the mathematical computations in machine learning. The high-level Layers API wraps commonly used machine learning algorithms and allows us to load trained models, such as models trained with Keras.
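The difference between the two levels can be seen in the short sketch below, which first calls a low-level operator and then assembles a small classifier with the Layers API (the layer sizes are arbitrary and only for illustration):

```js
const tf = require('@tensorflow/tfjs-node');

// Low-level API: direct operators, e.g. a matrix multiplication.
const logits = tf.matMul(tf.randomNormal([2, 4]), tf.randomNormal([4, 3]));
logits.print();

// High-level Layers API: assemble a small classifier from layers.
const model = tf.sequential({
  layers: [
    tf.layers.dense({ inputShape: [4], units: 8, activation: 'relu' }),
    tf.layers.dense({ units: 3, activation: 'softmax' }),
  ],
});
model.compile({
  optimizer: 'adam',
  loss: 'categoricalCrossentropy',
  metrics: ['accuracy'],
});
model.summary();
```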

Pipcook uses plug-ins to develop and run models. Each model-load plug-in ships with a specific model, and most models are implemented on top of tfjs. Tfjs-node also provides GPU acceleration and other features to speed up model training. Of course, due to the state of the ecosystem at this stage, the implementation cost of some specific models in tfjs is relatively high. For this case, pipcook also provides Python bridging and other methods, so you can call Python directly from the JS runtime environment for training. The details of this bridging will be described in the following chapters.
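Conceptually, a model-train step then does little more than feed the dataset into the model, as in the hedged sketch below (the real pipcook plug-in interfaces differ; `model` and `dataset` are assumed to come from the earlier sketches):

```js
// A conceptual sketch of training; pipcook's actual plug-in interfaces differ.
// `model` is a compiled Layers model and `dataset` yields {xs, ys} batches.
async function train(model, dataset) {
  // fitDataset streams batches from the dataset into the model, so the
  // full training set never needs to be held in memory at once.
  await model.fitDataset(dataset, {
    epochs: 10,
    callbacks: {
      onEpochEnd: (epoch, logs) => console.log(`epoch ${epoch}: loss=${logs.loss}`),
    },
  });
  // Persist the trained model so a deploy step can pick it up later.
  await model.save('file://./output/model');
}
```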

Deployment

For an industrial machine learning pipeline, after training the model, you need to deploy it in a way that allows it to serve the real business. Currently, pipcook deploys models through model-deploy plug-ins.

  • Quick validation solution: in many cases, you may want to run quick experiments on your data and models, for example with small batches and a small amount of data. In this scenario, you do not need to deploy the model remotely for verification, so pipcook has a built-in plug-in for local deployment. After training completes on the machine, pipcook starts a prediction server locally to serve predictions (a hypothetical sketch of such a service follows this list).
  • Server docker image: pipcook provides an official image that contains the necessary environment for pipcook training and prediction. You can deploy this image directly to your host machine, or use k8s and other cluster solutions to manage the docker images.
  • Cloud service connection: pipcook will gradually integrate with the machine learning deployment services of cloud providers. At this stage, Gcloud already provides a combination of tfjs and AutoML; pipcook will gradually support Alibaba Cloud, AWS, and other services in the future.
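For the quick-validation case in particular, a locally deployed prediction service could look roughly like the hypothetical sketch below; this is not pipcook's actual model-deploy plug-in, and the model path and port are assumptions for illustration:

```js
// A hypothetical sketch of a local prediction service; not pipcook's
// actual model-deploy plug-in. The model path and port are illustrative.
const http = require('http');
const tf = require('@tensorflow/tfjs-node');

async function main() {
  // Load the model saved after training (path is an assumption).
  const model = await tf.loadLayersModel('file://./output/model/model.json');

  http.createServer((req, res) => {
    let body = '';
    req.on('data', chunk => (body += chunk));
    req.on('end', () => {
      // Expect a JSON array of feature values, e.g. [0, 1, 0, 1].
      const features = JSON.parse(body);
      const prediction = model.predict(tf.tensor2d([features]));
      res.setHeader('Content-Type', 'application/json');
      res.end(JSON.stringify({ result: Array.from(prediction.dataSync()) }));
    });
  }).listen(3000, () => console.log('local prediction server on :3000'));
}

main();
```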

Comparison with TFX (Tensorflow Extended)

Our ultimate goal is a mature, industrial-level machine learning pipeline that can bring excellent models into production environments. In fact, to meet this demand, Google released TFX based on its long-term practice and open-sourced the project. So there may be questions about the difference between what we are doing and TFX. The goal of pipcook is not to replace any other framework, especially products based on the Python ecosystem. Because the mission of pipcook is to promote front-end intelligence, the technology stack and productization methods adopted by pipcook are all front-end oriented:

  • TFX uses a DAG because it involves data generation, statistical analysis, data validation, data conversion, and other operations that can be combined freely. For many front-end scenarios, we do not need such complex combinations, so pipcook uses a pipeline model that abstracts data operations into simple pipeline plug-ins, reducing the cost for front-end engineers.
  • TFX uses Apache Airflow for scheduling, while pipcook uses the front-end technology stack for such operations. For example, we use reactive frameworks such as RxJS to respond to and connect different plug-ins, making it easier for front-end developers to understand and contribute code.
  • At the same time, the API we designed follows JS conventions, so the cost for front-end developers to learn and get started is low.

Based on the above designs, we do our best to build a front-end-friendly machine learning environment that meets our expectations and goals.

Future Outlook

Pipcook has been open-source for a month, and during this period it has received feedback from some users. According to our plan, we still have a lot to do, and we hope pipcook can be continuously improved with the help of the open-source community to truly help front-end intelligence:

  • Work with cloud service providers (Alibaba Cloud, AWS, and Gcloud) to connect pipcook to their machine learning deployment services.
  • Improve the ecosystem, including a pipcook trial playground and similar tools, to reduce users' getting-started costs.
  • Better support for distributed training
  • Provide richer plug-ins and more refined models to support more pipelines.

In the future, we hope to combine the strength of Alibaba's internal front-end intelligence teams and the entire open-source community to continuously improve the front-end intelligence effort represented by pipcook, making front-end intelligence technology solutions widely accessible, accumulating more competitive samples and models, and providing intelligent code generation services with higher accuracy and availability. This can effectively improve front-end R&D efficiency, reduce simple repetitive work, and free us from overtime so that we can focus on more challenging work!

How to contribute?

If you are interested in our project and want to contribute to front-end intelligence, you are welcome to visit our Github open-source repository.
