My serverless ResNet-50 classifier
I had an idea to develop an image classification app in the cloud: wouldn't it be cool to let users run classification on their own personal images? I challenged myself to do this serverlessly on AWS Lambda, without using any EC2 instances/VMs to perform the inference task. This wasn't simple, as most ML inference is done on a VM, but I opted for serverless to keep my costs down.
I decided to start with the backend, and took inspiration from this AWS blog post: https://aws.amazon.com/blogs/machine-learning/using-container-images-to-run-tensorflow-models-in-aws-lambda/. Essentially, I would package the famous ResNet-50 image classification model as a container image and push it to AWS' ECR container registry. A Lambda function (henceforth referred to as tensorflow-inference) runs this container image; whenever an image is uploaded to S3, the upload fires an event that invokes tensorflow-inference.
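To give a feel for what runs inside that container, here is a minimal sketch of the handler. The event shape is the standard S3 trigger payload, but the bucket handling, names and the final return value are illustrative rather than lifted from my repo (in the real app the results are pushed back over a WebSocket, as described below).

```python
# Sketch of the tensorflow-inference handler packaged in the container image.
import json
from urllib.parse import unquote_plus

import boto3
import numpy as np
from tensorflow.keras.applications.resnet50 import (
    ResNet50, preprocess_input, decode_predictions,
)
from tensorflow.keras.preprocessing import image

# Load the model once at module scope, when the container starts,
# so warm invocations can reuse it instead of reloading the weights.
model = ResNet50(weights="imagenet")
s3 = boto3.client("s3")

def handler(event, context):
    # The S3 "ObjectCreated" trigger passes the bucket and (URL-encoded) key.
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = unquote_plus(record["object"]["key"])

    # Download the uploaded image into the Lambda's /tmp scratch space.
    local_path = f"/tmp/{key.split('/')[-1]}"
    s3.download_file(bucket, key, local_path)

    # Resize to the 224x224 input ResNet-50 expects, then classify.
    img = image.load_img(local_path, target_size=(224, 224))
    x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
    preds = decode_predictions(model.predict(x), top=5)[0]

    # Top-5 labels and probabilities as JSON-friendly data. In my app these
    # get sent back over the WebSocket rather than just returned here.
    results = [{"label": label, "probability": float(prob)}
               for (_, label, prob) in preds]
    return {"statusCode": 200,
            "body": json.dumps({"key": key, "predictions": results})}
```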
So that is the backend solution for running ML inference serverlessly with AWS Lambda, easy enough so far. The next challenge was getting the backend to communicate asynchronously with the frontend of the app, because the function can take up to 40 seconds when running from a cold start! (This is one of the downsides of doing ML inference serverlessly: if traffic to the app is low, AWS shuts down the Lambda container after a few minutes of inactivity, and the next cold start takes a while to load the large TensorFlow package.)
Enter WebSockets! I had to learn how to use the WebSocket API to set up two-way communication between the frontend and my backend through AWS API Gateway.
Here is the full flow of events for the classifier:
- Using HTTP file upload, the user selects an image file to perform inference on.
- The React frontend calls a Lambda function that gets a presigned URL from S3 for uploading the photo, plus a second URL for displaying the image once it is in S3 (a sketch of this function is shown after this list).
- Once the photo lands in S3, tensorflow-inference is triggered automatically and starts performing inference on the image. Concurrently, the frontend opens a WebSocket connection to the backend through API Gateway.
- Another Lambda function is triggered to save the WebSocket connection ID in a DynamoDB table, using the filename as the partition key so that tensorflow-inference can find the connection ID later (also sketched after this list).
- When tensorflow-inference finishes, it looks up the connection ID in DynamoDB and sends the results back to the frontend over the WebSocket (see the final sketch below).
- The React frontend parses the JSON message of inference results, displays it in the browser, and then closes the connection.
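To make the flow concrete, here is a minimal sketch of the presigned-URL Lambda from the second step. The bucket name, request fields and expiry times are placeholders, not the exact values in my repo.

```python
# Sketch of the Lambda that hands the frontend its upload and view URLs.
import json
import boto3

s3 = boto3.client("s3")
BUCKET = "my-classifier-uploads"  # placeholder bucket name

def handler(event, context):
    filename = json.loads(event["body"])["filename"]

    # URL the browser can PUT the image to directly, bypassing the backend.
    upload_url = s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": BUCKET, "Key": filename},
        ExpiresIn=300,
    )
    # URL the frontend can use to display the image once it is in S3.
    view_url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": BUCKET, "Key": filename},
        ExpiresIn=3600,
    )
    return {"statusCode": 200,
            "body": json.dumps({"uploadUrl": upload_url, "viewUrl": view_url})}
```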
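The Lambda that records the WebSocket connection can be sketched like this; the table name, attribute names and message body are assumptions for illustration, not necessarily what my code uses.

```python
# Sketch of the Lambda behind the WebSocket route that stores the connection ID.
import json
import boto3

table = boto3.resource("dynamodb").Table("websocket-connections")  # placeholder table

def handler(event, context):
    # API Gateway supplies the WebSocket connection ID in the request context.
    connection_id = event["requestContext"]["connectionId"]
    # The frontend sends the filename it uploaded; it becomes the partition key
    # so tensorflow-inference can look the connection up by filename later.
    filename = json.loads(event["body"])["filename"]

    table.put_item(Item={"filename": filename, "connectionId": connection_id})
    return {"statusCode": 200, "body": "Connection stored."}
```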
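Finally, here is roughly how tensorflow-inference can push the results back through API Gateway's Management API once inference is done. The endpoint URL, table name and helper name are placeholders.

```python
# Sketch of pushing inference results back over the open WebSocket.
import json
import boto3

table = boto3.resource("dynamodb").Table("websocket-connections")  # placeholder table
apigw = boto3.client(
    "apigatewaymanagementapi",
    # The callback endpoint of the WebSocket API (placeholder values).
    endpoint_url="https://<api-id>.execute-api.<region>.amazonaws.com/<stage>",
)

def send_results(filename, predictions):
    # Look up the WebSocket connection stored under this image's filename.
    item = table.get_item(Key={"filename": filename})["Item"]
    connection_id = item["connectionId"]

    # Push the inference results to the browser over the open WebSocket.
    apigw.post_to_connection(
        ConnectionId=connection_id,
        Data=json.dumps({"filename": filename,
                         "predictions": predictions}).encode("utf-8"),
    )
```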
There you go: a serverless image classification app deployed fully in the cloud, running at less than a dollar a month! I only need to pay for storing the Docker image in AWS ECR. A future improvement would be to add other image classification models besides ResNet-50, so users can compare how different neural networks perform on their own images. I could also add a rating feature to collect feedback and analyse which networks perform best!
Check out my code at https://github.com/wootoodoo/resnet50-classifier, and feel free to clone it and play around with it!