How to Use an Object Detection Model in a React App with TensorFlow Serving
Our team built a proof of concept that uses a mobile web application with a video stream and a custom-trained multiple object detection model to filter relevant content for customers in retail stores.
A web application like this can be useful as it allows customers to immediately gain information about multiple products simultaneously. Expeditious in-store shopping is a priority for customers and retailers, and customers also want easy access to product information. Additionally, a web application will not contribute to customer app fatigue because there is no download requirement.
You can read my coworker’s blog on how we retrained the ssd_mobilenet_v2_coco model to recognize three types of sneakers.
Once we have our model, we want to use it in an application context. There is a JavaScript library called TensorFlow.js that allows a model to be imported into a web app; however, we decided to let TensorFlow handle the model serving with TensorFlow Serving. Let's set up TensorFlow Serving in our React application.
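For comparison, loading a converted model directly in the browser with TensorFlow.js would look roughly like the minimal sketch below; the model URL is hypothetical, and the model would first have to be converted to the TensorFlow.js format (for example with tensorflowjs_converter), neither of which we did for this project.

// Not used in our app: client-side loading with TensorFlow.js (sketch; the URL is a placeholder)
import * as tf from '@tensorflow/tfjs'

async function loadClientSideModel() {
  // loadGraphModel fetches the converted model.json and its weight shards
  const model = await tf.loadGraphModel('https://example.com/converted_model/model.json')
  return model
}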
Set Up TensorFlow Serving
The target architecture has the React app making REST API calls to a model served by TensorFlow Serving. TensorFlow Serving is a serving system that provides out-of-the-box integration with TensorFlow models. Docker manages the TensorFlow Serving environment using the TensorFlow Serving Docker image.
Following this helpful GitHub thread and the TensorFlow Serving configuration instructions, we wrote a Dockerfile to serve our custom-trained model. The Dockerfile pulls the TensorFlow Serving Docker image, renames the exported model file to saved_model.pb, moves it and the (empty) variables folder into the v2-trained-ocr-detector/1 folder, and runs a TensorFlow ModelServer.
# Dockerfile
FROM tensorflow/serving

WORKDIR /models
COPY all_saved_models .
COPY model_server.conf .

RUN mkdir -p v2-trained-ocr-detector/1
RUN mv ./v2_trained_model/v2_trained_model.pb ./v2_trained_model/saved_model.pb
RUN mv ./v2_trained_model/saved_model.pb v2-trained-ocr-detector/1
RUN mv ./v2_trained_model/variables v2-trained-ocr-detector/1

ENTRYPOINT ["tensorflow_model_server", "--model_config_file=/models/model_server.conf", "--rest_api_port=8501", "--port=8081"]
The Model Server configuration file referenced in model_config_file specifies the name and path of the models to be served.
# model_server.conf
model_config_list {
  config {
    name: 'v2-trained-model'
    base_path: '/models/v2-trained-ocr-detector'
    model_platform: "tensorflow"
  }
}
In the terminal, build the tf-serving image and run it as a container, publishing port 8501 for the REST API and 8081 for gRPC.
docker build -t tf-serving .
docker run -p 8501:8501 -p 8081:8081 -d tf-serving
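At this point it is worth sanity-checking that the ModelServer is actually serving the model. One way is to query TensorFlow Serving's Model Status API; the snippet below is a minimal sketch that assumes it is run from Node with axios installed (a cross-origin request from the browser would run into the CORS issue described later).

// check-model-status.js (hypothetical helper script)
const axios = require('axios')

// Model Status API: GET /v1/models/<model name>
axios.get('http://localhost:8501/v1/models/v2-trained-model')
  .then((response) => console.log(JSON.stringify(response.data, null, 2)))
  .catch((error) => console.error('Model not reachable:', error.message))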
Now that TensorFlow Serving is running, we can use it to predict objects in our app.
Retrieve Predictions in the React App
Our React app was built using create-react-app. The code relevant to TensorFlow Serving is shown below.
In App.js, the urlHost.urlTF variable points to our TensorFlow ModelServer on localhost and urlHost.model is the name of the model.
// App.js
const urlHost = {
  urlTF: 'http://localhost:8501/',
  model: 'v2-trained-model'
}
The detection routine, this.detectObjects(), can be called as soon as the video load promise is fulfilled.
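The wiring that starts the video stream is not shown in our snippets. A minimal sketch of the idea (the startVideo() method, the element id, and the camera constraints are illustrative, but the videoStreaming state flag matches the detection code below) could look like this:

// App.js (sketch: startVideo(), the element id, and the constraints are illustrative)
async startVideo() {
  const stream = await navigator.mediaDevices.getUserMedia({ video: { facingMode: 'environment' } })
  const video = document.getElementById('video')
  video.srcObject = stream
  // Once metadata is loaded, start playback, flag the stream as live, and kick off detection
  video.onloadedmetadata = () => {
    video.play()
    this.setState({ videoStreaming: true }, () => this.detectObjects())
  }
}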
detectObjects() makes a POST call to TensorFlow Serving's Predict REST API. The video image tensor is transformed into a nested array with arraySync() to conform to the API spec. The prediction output is then used to draw bounding boxes and display the prediction class text.
// App.js
import * as tf from '@tensorflow/tfjs'
import axios from 'axios'
...

async detectObjects() {
  if (this.state.videoStreaming) {
    // Capture the current video frame as a tensor
    let imageTensor = await tf.browser.fromPixels(document.getElementById('video'))
    // Convert the tensor to a nested array to conform to the REST API spec
    let imageTensorArr = imageTensor.arraySync()
    // Free the tensor's memory now that the array copy exists
    imageTensor.dispose()
    let data = {"signature_name": "serving_default", "instances": [imageTensorArr]}
    let headers = {"content-type": "application/json"}
    let url = urlHost.urlTF + "v1/models/" + urlHost.model + ":predict"
    await axios.request({
      url: url,
      method: 'post',
      data: data,
      headers: headers
    })
    .then((response) => {
      let predictions = response.data["predictions"][0]
      this.showDetections(predictions["detection_boxes"], predictions["num_detections"], predictions["detection_classes"], predictions["detection_scores"])
    })
    .catch((error) => {
      console.log(error)
    })
    // Schedule the next detection pass
    requestAnimationFrame(() => {
      this.detectObjects()
    })
  }
}
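showDetections() itself is not included in the snippet above. A minimal sketch of what it might do, assuming a <canvas id="overlay"> positioned over the video, normalized [ymin, xmin, ymax, xmax] boxes as returned by TensorFlow Object Detection API models, and a hypothetical labelMap of class IDs to names:

// App.js (sketch: the canvas id, score threshold, and labelMap are illustrative)
showDetections(boxes, numDetections, classes, scores) {
  const canvas = document.getElementById('overlay')
  const ctx = canvas.getContext('2d')
  ctx.clearRect(0, 0, canvas.width, canvas.height)
  const labelMap = { 1: 'sneaker-a', 2: 'sneaker-b', 3: 'sneaker-c' }  // hypothetical class names
  for (let i = 0; i < numDetections; i++) {
    if (scores[i] < 0.5) continue  // skip low-confidence detections
    // Boxes are normalized [ymin, xmin, ymax, xmax]; scale them to the canvas size
    const [ymin, xmin, ymax, xmax] = boxes[i]
    const x = xmin * canvas.width
    const y = ymin * canvas.height
    ctx.strokeStyle = 'lime'
    ctx.lineWidth = 2
    ctx.strokeRect(x, y, (xmax - xmin) * canvas.width, (ymax - ymin) * canvas.height)
    ctx.fillStyle = 'lime'
    ctx.font = '16px sans-serif'
    ctx.fillText(labelMap[classes[i]] || classes[i], x, y > 16 ? y - 4 : y + 16)
  }
}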
This approach works for both the ssd_mobilenet_v2_coco pre-trained model and our custom-trained model.
However, when we run the app locally, the browser throws a CORS error because we are making a POST call to http://localhost:8501 when our app is running on http://localhost:3000. As a temporary solution, we can disable CORS in Chrome with this command in the terminal:
open -a Google\ Chrome --args --disable-web-security --user-data-dir
One way to solve the CORS issue is to use a load balancer on Google Cloud.
Host the App on Google Cloud
A load balancer distributes user traffic across multiple instances and offers a single IP address to serve as the frontend. The diagram below shows the configuration of the external HTTPS load balancer that was created by following the Google Cloud instructions.
With this configuration, a user can visit a domain (e.g. multipleobjectdetection.com) and one of the three VM instances will be used depending on the path. The default path /*, or https://multipleobjectdetection.com, points to the React app. The /v1/* path points to TensorFlow Serving. As a result, the app can make a POST call to https://multipleobjectdetection.com/v1/models/v2-trained-model:predict without a CORS error. An HTTPS domain is used because getUserMedia(), which provides video access in the app, requires the browser's page to be loaded using HTTPS, the file:/// URL scheme, or localhost.
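With this routing in place, the only change needed in the app is to point urlHost at the load-balanced domain (the domain below is the same illustrative one used above):

// App.js (configuration when served behind the load balancer; the domain is illustrative)
const urlHost = {
  urlTF: 'https://multipleobjectdetection.com/',  // the URL map routes /v1/* to TensorFlow Serving
  model: 'v2-trained-model'
}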
There are two crucial steps when setting up the load balancer.
- Create a firewall rule so Google Cloud probers with IP ranges 35.191.0.0/16 and 130.211.0.0/22 can connect to backend services.
- Modify the TensorFlow Serving backend service health check request path to /v1/models/v2-trained-model instead of the default /. The default path returns an HTTP 404 (Not Found) error, and if the health check does not receive an HTTP 200 (OK) response, the backend is not eligible to receive new connections. The new path calls the Model Status API, which returns:
{
  "model_version_status": [
    {
      "version": "1",
      "state": "AVAILABLE",
      "status": {
        "error_code": "OK",
        "error_message": ""
      }
    }
  ]
}
Interested in hearing more?
Please contact me or my team member Ronak Bhatia. To read more about Accenture Labs and our R&D areas, visit our website.