How to Use an Object Detection Model in a React App with TensorFlow Serving

Our team built a proof of concept that uses a mobile web application with a video stream and a custom-trained multiple object detection model to filter relevant content for customers in retail stores.

Proof of concept video

A web application like this can be useful as it allows customers to immediately gain information about multiple products simultaneously. Expeditious in-store shopping is a priority for customers and retailers, and customers also want easy access to product information. Additionally, a web application will not contribute to customer app fatigue because there is no download requirement.

You can read my coworker’s blog on how we retrained the ssd_mobilenet_v2_coco model to recognize three types of sneakers.

Once we have our model, we want to use it in an application context. There is a JavaScript library called TensorFlow.js that allows a model to be imported into a web app; however, we decided to let TensorFlow handle the model serving with TensorFlow Serving. Let's set up TensorFlow Serving in our React application.

Set Up TensorFlow Serving

The target architecture has the React app making REST API calls to a model provided by TensorFlow Serving. TensorFlow Serving is a serving system that provides out-of-the-box integration with TensorFlow models. Docker manages the TensorFlow Serving environment using TensorFlow Docker images for serving.

Setup overview (cloud hosting optional)

Following this helpful GitHub thread and the TensorFlow configuration instructions, we wrote a Dockerfile to serve our custom-trained model. The Dockerfile pulls the TensorFlow Serving Docker image, moves the saved_model.pb file and empty variables folder into the v2-trained-ocr-detector/1 folder, and runs a TensorFlow ModelServer.

Folder structure of the model files (note that it is fine for the variables folder to be empty)
# Dockerfile
FROM tensorflow/serving
WORKDIR /models
COPY all_saved_models .
COPY model_server.conf .
RUN mkdir -p v2-trained-ocr-detector/1
RUN mv ./v2_trained_model/v2_trained_model.pb ./v2_trained_model/saved_model.pb
RUN mv ./v2_trained_model/saved_model.pb v2-trained-ocr-detector/1
RUN mv ./v2_trained_model/variables v2-trained-ocr-detector/1
ENTRYPOINT ["tensorflow_model_server", "--model_config_file=/models/model_server.conf", "--rest_api_port=8501", "--port=8081"]

The Model Server configuration file referenced in model_config_file specifies the name and path of the models to be served.

# model_server.conf
model_config_list {
  config {
    name: 'v2-trained-model'
    base_path: '/models/v2-trained-ocr-detector'
    model_platform: "tensorflow"
  }
}

In the terminal, build the tf-serving image and run it as a detached container, with port 8501 exposed for the REST API and port 8081 for gRPC.

docker build -t tf-serving .
docker run -p 8501:8501 -p 8081:8081 -d tf-serving

Now that TensorFlow Serving is running, we can use it to predict objects in our app.

Retrieve Predictions in the React App

Our React app was built using create-react-app. The code relevant to TensorFlow Serving is shown below.

In App.js, the urlHost.urlTF variable points to our TensorFlow ModelServer on localhost and urlHost.model is the name of the model.

// App.js
const urlHost = {
  urlTF: 'http://localhost:8501/',
  model: 'v2-trained-model'
}
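For clarity, the Predict endpoint is built by joining these two values with TensorFlow Serving's versioned REST path. A small helper sketch (hypothetical — the app simply inlines this concatenation) makes the construction explicit:

```javascript
// Build the TensorFlow Serving Predict URL from the urlHost config.
// predictUrl is a hypothetical helper; App.js inlines this concatenation.
function predictUrl(urlHost) {
  return urlHost.urlTF + "v1/models/" + urlHost.model + ":predict"
}

// With the config above:
// predictUrl(urlHost) === 'http://localhost:8501/v1/models/v2-trained-model:predict'
```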

The detection routine, this.detectObjects(), can be called as soon as the video load promise is fulfilled.

detectObjects() makes a POST call to TensorFlow’s Predict API. The video image tensor is transformed into a nested array with arraySync() to conform to the API specs. The prediction output is then used to draw bounding boxes and display the prediction class text.

// App.js
import * as tf from '@tensorflow/tfjs'
...
async detectObjects() {
  if (this.state.videoStreaming) {
    let imageTensor = await tf.browser.fromPixels(document.getElementById('video'))
    let imageTensorArr = imageTensor.arraySync()
    let data = {"signature_name": "serving_default", "instances": [imageTensorArr]}
    let headers = {"content-type": "application/json"}
    let url = urlHost.urlTF + "v1/models/" + urlHost.model + ":predict"
    await axios.request({
      url: url,
      method: 'post',
      data: data,
      headers: headers
    })
    .then((response) => {
      let predictions = response.data["predictions"][0]
      this.showDetections(predictions["detection_boxes"], predictions["num_detections"], predictions["detection_classes"], predictions["detection_scores"])
    })
    .catch((error) => {
      console.error(error)
    })
  }
  requestAnimationFrame(() => this.detectObjects())
}
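For reference, the Predict API request body has the shape shown below. A tiny 2×2×3 array stands in here for the real height × width × 3 frame produced by tf.browser.fromPixels():

```javascript
// Shape of the Predict request body: a batch containing one image tensor.
// The 2x2x3 array is a stand-in for a real height x width x 3 video frame.
const imageTensorArr = [
  [[0, 0, 0], [255, 255, 255]],
  [[255, 0, 0], [0, 0, 255]]
]
const data = {
  signature_name: "serving_default",
  instances: [imageTensorArr]  // one image per request
}
```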

This approach works for both the ssd_mobilenet_v2_coco pre-trained model and our custom-trained model.

However, when we run the app locally, the browser throws a CORS error because we are making a POST call to http://localhost:8501 while our app is served from http://localhost:3000. As a temporary workaround, we can launch Chrome with web security (and therefore CORS checks) disabled:

open -a Google\ Chrome --args --disable-web-security --user-data-dir
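Since the app was built with create-react-app, another local-only workaround is the development server's proxy setting in package.json, which forwards requests the dev server cannot handle to TensorFlow Serving. This assumes the app then requests a relative path such as /v1/models/v2-trained-model:predict instead of the absolute localhost:8501 URL:

```json
{
  "proxy": "http://localhost:8501"
}
```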

One way to solve the CORS issue for a deployed app is to use a load balancer on Google Cloud.

Host the App on Google Cloud

A load balancer distributes user traffic across multiple instances and offers a single IP address to serve as the frontend. The diagram below shows the configuration of the external HTTPS load balancer that was created by following the Google Cloud instructions.

External HTTPS load balancer

With this configuration, a user visits a single domain, and one of the three VM instances handles the request depending on the path. The default path /* points to the React app, and the /v1/* path points to TensorFlow Serving. As a result, the app can make the Predict POST call to the same origin without a CORS error. An HTTPS domain is used because getUserMedia(), which provides video access in the app, requires the browser's page to be loaded over HTTPS, the file:/// URL scheme, or localhost.

There are two crucial steps when setting up the load balancer.

  1. Create a firewall rule so Google Cloud health check probers (the documented IP ranges 130.211.0.0/22 and 35.191.0.0/16) can connect to backend services.
  2. Modify the TensorFlow Serving backend service health check request-path to /v1/models/v2-trained-model instead of the default /. The default path leads to an HTTP 404 (Not Found) error, and if the health check does not receive an HTTP 200 (OK) response, the backend is not eligible to receive new connections. The new path calls the Model Status API, which returns:
{
  "model_version_status": [
    {
      "version": "1",
      "state": "AVAILABLE",
      "status": {
        "error_code": "OK",
        "error_message": ""
      }
    }
  ]
}
TensorFlow Serving backend service health check
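The load balancer only needs the HTTP 200 response, but the response body can also be checked programmatically. A small hypothetical helper (not part of the load balancer setup) over the Model Status JSON:

```javascript
// Check a Model Status API response for a servable model version.
// isAvailable is a hypothetical helper, illustrating the response structure.
function isAvailable(statusJson) {
  return statusJson.model_version_status.some((v) => v.state === 'AVAILABLE')
}

// isAvailable({ model_version_status: [{ version: "1", state: "AVAILABLE" }] }) === true
```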

Interested in hearing more?

Please contact me or my team member Ronak Bhatia. To read more about Accenture Labs and our R&D areas, visit our website.
