Scalable image classification with ONNX.js and AWS Lambda

Sep 1, 2019

In this article, I show you how to build a scalable image classifier on AWS using ONNX.js and the serverless framework. ONNX is an open-source standard that serialises models. If you are not familiar with ONNX, feel free to check out my introduction article to ONNX: Portability between deep learning frameworks — with ONNX.

Deploying an ONNX model on AWS Lambda with Node.js and ONNX.js

The model — ResNet50

We use the ResNet architecture for image classification. In 2015, ResNet won the image classification task of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), an annual computer vision competition.

Instead of training the network from scratch, we use a pretrained model. PyTorch provides a pretrained ResNet through the torchvision package. The model was trained on 1,000 classes of the ImageNet database, which in total contains images from over 20,000 categories. We need to load the model and convert it to ONNX. While writing the code for this article some weeks ago, I ran into an issue when converting the model from PyTorch to ONNX: the ONNX specification did not support one of the operations used in the forward function. After a few simple changes to the forward function, I was able to export the model. You can find the changes in this code. One thing worth highlighting is that we need to create a dummy input when exporting the model from PyTorch to ONNX. The exporter traces the model by running it once, so it needs an input tensor of the right shape, here [1, 3, 224, 224]; the values themselves do not matter.

Inference with ONNX.js

For the inference, we use ONNX.js, a JavaScript library for running ONNX models. The library works in the browser and in Node.js and supports both CPU and GPU execution: on the CPU it uses WebAssembly, on the GPU it uses WebGL. Running a prediction takes three steps: create an inference session, load the model, and run the session on an input tensor.
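The original snippet is not part of this copy of the article, so the following is only a minimal sketch of a prediction with the ONNX.js classes InferenceSession and Tensor. The helper names createSession, firstOutput and predict are hypothetical, and the input shape [1, 3, 224, 224] is taken from the preprocessing described later in the article.

```javascript
// Sketch only: the helper names are assumptions, not the author's code.

// Load ONNX.js lazily so the pure helper below works without the package.
function loadOnnx() {
  return require('onnxjs');
}

// Create a session and load a serialised ONNX model (file path in Node.js).
async function createSession(modelPath) {
  const { InferenceSession } = loadOnnx();
  const session = new InferenceSession();
  await session.loadModel(modelPath);
  return session;
}

// session.run resolves to a Map from output name to tensor.
// A classifier has a single output, so we take the first value.
function firstOutput(outputMap) {
  return outputMap.values().next().value.data;
}

// Wrap a Float32Array of shape [1, 3, 224, 224] in a tensor and run the model.
async function predict(session, pixels) {
  const { Tensor } = loadOnnx();
  const input = new Tensor(pixels, 'float32', [1, 3, 224, 224]);
  const outputMap = await session.run([input]);
  return firstOutput(outputMap);
}
```

In a Lambda function it is usually worth creating the session once outside the handler, so that warm invocations can reuse the loaded model.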

AWS Lambda and Serverless

AWS Lambda is a service from Amazon Web Services (AWS) that executes code in response to specified events, such as an incoming REST request. The developer only has to write the code, while AWS takes care of provisioning and scaling.

Serverless means that the management of the servers running the application is delegated to a vendor such as AWS. The developer only needs to provide the code, while operational tasks such as configuring firewalls, setting up virtual machines and patching the operating system are handled by AWS. In addition to executing the function, the Lambda service handles its scaling and availability. Compared to conventionally scaled servers, AWS Lambda has the advantage that costs arise only for actual usage, not for idle time.

Developing Lambdas with Serverless Framework

A Lambda function can be deployed in different runtime environments. In our case, we choose a Node.js runtime so that we can use ONNX.js. Additionally, we use the Serverless Framework, one of the best-known toolkits for developing serverless applications, which automates the deployment of the function.

We use npm to install the Serverless Framework. The command serverless create --template aws-nodejs --path image_classifier creates a blueprint for the project. It creates two files in the directory image_classifier: serverless.yml and handler.js. The Lambda functions are registered in the YML file, which also stores their configuration.

Tutorial: Lambda, Node.js and ONNX.js

We are developing a Lambda function that returns the top 5 predicted categories for an image. We send the image base64-encoded in a POST request to the Lambda endpoint. Before we run the classification, we have to preprocess the image:

1. Base64 decoding of the image:

2. Resizing of the image:

3. Normalisation of the image: We use the ndarray library to normalise the data and transform it into a multidimensional array. First, we create an ndarray from the ImageData. In addition to the width and height, we have to specify a third dimension of 4, the number of RGBA channels. The pixel values are scaled to the range 0 to 1 by dividing by 255. Additionally, the RGB channels are normalised with the mean values [0.485, 0.456, 0.406] and the standard deviations [0.229, 0.224, 0.225]: we subtract the mean from the respective colour channel and divide by the standard deviation. After these steps, we have an array of shape [224, 224, 4]. It must be transformed to [1, 3, 224, 224] for the model, which drops the alpha channel and moves the channels to the front.
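The three preprocessing steps can be sketched as follows. The original snippets are not part of this copy of the article, so this is an illustration under stated assumptions rather than the author's code. It uses plain typed arrays instead of the ndarray library and a simple nearest-neighbour resize instead of an image library. All function names are hypothetical. Decoding the JPEG or PNG bytes into RGBA pixels is left out, because it needs an image library.

```javascript
// Sketch only: function names are hypothetical. Turning the decoded file
// bytes into RGBA pixels is not shown and needs an image library.

// Step 1: base64 string -> raw image file bytes.
// An optional data-URI prefix is stripped first.
function decodeBase64Image(base64) {
  const payload = base64.replace(/^data:image\/[a-zA-Z0-9.+-]+;base64,/, '');
  return Buffer.from(payload, 'base64');
}

// Step 2: nearest-neighbour resize of RGBA pixels (4 bytes per pixel).
function resizeNearest(src, srcWidth, srcHeight, dstWidth, dstHeight) {
  const dst = new Uint8ClampedArray(dstWidth * dstHeight * 4);
  for (let y = 0; y < dstHeight; y++) {
    const srcY = Math.min(srcHeight - 1, Math.floor((y * srcHeight) / dstHeight));
    for (let x = 0; x < dstWidth; x++) {
      const srcX = Math.min(srcWidth - 1, Math.floor((x * srcWidth) / dstWidth));
      const s = (srcY * srcWidth + srcX) * 4;
      const d = (y * dstWidth + x) * 4;
      for (let c = 0; c < 4; c++) dst[d + c] = src[s + c];
    }
  }
  return dst;
}

// Step 3: scale to [0, 1], normalise each RGB channel, drop alpha and
// reorder from interleaved RGBA to channel planes, i.e. [1, 3, height, width].
const MEAN = [0.485, 0.456, 0.406];
const STD = [0.229, 0.224, 0.225];

function toModelInput(rgba, width, height) {
  const planeSize = width * height;
  const out = new Float32Array(3 * planeSize);
  for (let i = 0; i < planeSize; i++) {
    for (let c = 0; c < 3; c++) {
      out[c * planeSize + i] = (rgba[i * 4 + c] / 255 - MEAN[c]) / STD[c];
    }
  }
  return out;
}
```

For a 224 x 224 image, toModelInput returns a Float32Array with 3 * 224 * 224 values that can be wrapped in a tensor of shape [1, 3, 224, 224].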

Putting it together: handler.js

After we have implemented the preprocessing, we can put the scripts together in one endpoint. You can find the full code in the repository: aws-lambda-image-classifier-onnx. You can test the application with the test_lambda.sh script.
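The last step inside the endpoint, turning the model output into the top 5 categories, could look like the sketch below. The function names are hypothetical. The labels array is assumed to hold the 1,000 ImageNet class names in model output order. Applying a softmax to obtain probabilities is also an assumption, since the repository may return raw scores instead.

```javascript
// Sketch only: names are hypothetical, and the softmax step is an assumption.

// Convert raw model scores to probabilities that sum to 1.
// Subtracting the maximum keeps Math.exp from overflowing.
function softmax(scores) {
  const max = Math.max(...scores);
  const exps = Array.from(scores, (s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Return the k most probable classes with their labels, best first.
function topK(scores, labels, k = 5) {
  return softmax(scores)
    .map((probability, index) => ({ index, label: labels[index], probability }))
    .sort((a, b) => b.probability - a.probability)
    .slice(0, k);
}
```

The resulting array can be serialised with JSON.stringify and returned as the body of the Lambda response.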

Summary

In this article, we looked at deploying an ONNX model on AWS Lambda. The first step was to convert a trained ResNet from PyTorch to ONNX. The converted model is then used with ONNX.js to run the inference, with the calculations running on WebAssembly or WebGL. The Serverless Framework helps us deploy the Node.js application as an AWS Lambda function. With AWS Lambda, the application scales as needed, and we don’t have to worry about operating it.

AWS Lambda is an excellent alternative to EC2 instances for running inference, depending on the use case. Please note that due to the RAM limit of AWS Lambda (< 3 GB), not every model can be deployed. Another limitation is the ONNX standard, which does not necessarily support every operation and data type, so some models cannot be converted.

Feel free to join the Machine Learning in Production LinkedIn Group to learn more about the development and deployment of machine learning applications.

The German version of this post can be found here. Check out more posts on deep learning on our blog.