Object detection in Android using React Native & TensorFlow.js

Ricky H. Putra · Published in The Startup · Jun 14, 2020 · 4 min read


With the advances in AI, computing hardware, and smartphones, it is easier than ever to build and deploy machine learning applications today. Many frameworks and tools, such as React Native and TensorFlow, are available for building sophisticated AI-enabled mobile applications.

In this post, I am using React Native and TensorFlow.js, two very popular frameworks used to build mobile applications, backed by companies like Facebook and Google. OK, let’s get started.

First, clone my sample project here:

$ git clone https://github.com/rickyhp/obj-detection.git
$ cd obj-detection
$ yarn install

You should have the Expo project ready in less than a minute. To start it, run the command below:

expo start

You can run the app in an Android emulator and see the result.

Now, I will walk through the main pieces of the code to help you understand it further.

Open App.js and you should find the first section at the top, where we import all the necessary libraries:

import React, { Component } from 'react';
import { StyleSheet, Text, View, StatusBar, ActivityIndicator, TouchableOpacity, Image } from 'react-native'
import { ScrollView } from 'react-native-gesture-handler';

import Constants from 'expo-constants'
import * as Permissions from 'expo-permissions'
import * as ImagePicker from 'expo-image-picker'
import * as FileSystem from 'expo-file-system'

import * as jpeg from 'jpeg-js'

import * as tf from '@tensorflow/tfjs';
import '@tensorflow/tfjs-react-native';

// import * as mobilenet from '@tensorflow-models/mobilenet'
// You can try with other models, see https://github.com/tensorflow/tfjs-models
import * as cocossd from '@tensorflow-models/coco-ssd'

The first 3 imports are standard React Native UI component libraries, followed by 4 imports related to the Expo framework. jpeg-js is used to decode raw image data into the 3D tensor that our model requires. The next 2 imports bring in TensorFlow.js and its React Native adapter, since we are building a React Native app. The last import is the model we use to run predictions on the image.

There are three main pieces of logic in our sample code:

  1. Preparing the TensorFlow engine
  2. Loading the model
  3. The user chooses an image, the model runs its prediction, and the result is printed below the image
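Stripped of the UI, the flow of these three steps can be sketched in plain JavaScript with mocked stand-ins for tf and cocossd (fakeTf, fakeCocossd, and the hard-coded prediction below are made up purely for illustration, not the real libraries' behavior):

```javascript
// Mock stand-ins for tf and cocossd, so the sequencing is visible on its own.
const fakeTf = { ready: async () => {} };
const fakeModel = {
  detect: async (imageTensor) => [
    { class: 'person', score: 0.92, bbox: [10, 20, 150, 300] },
  ],
};
const fakeCocossd = { load: async () => fakeModel };

async function runPipeline(imageTensor) {
  await fakeTf.ready();                   // Step 1: wait for the engine
  const model = await fakeCocossd.load(); // Step 2: load the model
  return model.detect(imageTensor);       // Step 3: run the prediction
}

runPipeline('fake-tensor').then((preds) => console.log(preds[0].class)); // prints "person"
```

The important point is the ordering: detection must not run until both tf.ready() and the model load have resolved, which is why the real app tracks isTfReady and isModelReady in state.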

Step 1: Preparing TensorFlow engine

await tf.ready();
this.setState({ isTfReady: true });

Step 2: Loading model

this.model = await cocossd.load();
this.setState({ isModelReady: true });

Step 3: The user chooses an image, the model runs its prediction, and the result is printed below the image

try {
  let response = await ImagePicker.launchImageLibraryAsync({
    mediaTypes: ImagePicker.MediaTypeOptions.All,
    allowsEditing: true,
    aspect: [4, 3]
  })

  if (!response.cancelled) {
    const source = { uri: response.uri }
    this.setState({ image: source })
  }
} catch (error) {
  console.log(error)
}

The detectObjects method holds our main prediction logic:

detectObjects = async () => {
  try {
    const imageAssetPath = Image.resolveAssetSource(this.state.image)

    const imgB64 = await FileSystem.readAsStringAsync(imageAssetPath.uri, {
      encoding: FileSystem.EncodingType.Base64,
    })
    const imgBuffer = tf.util.encodeString(imgB64, 'base64').buffer
    const raw = new Uint8Array(imgBuffer)
    const imageTensor = this.imageToTensor(raw)
    console.log('imageTensor: ', imageTensor)

    const predictions = await this.model.detect(imageTensor)
    this.setState({ predictions: predictions })

    console.log('----------- predictions: ', predictions)
  } catch (error) {
    console.log('Exception Error: ', error)
  }
}

In my case, I use FileSystem.readAsStringAsync to read the image from local storage. It comes back as a base64-encoded string, which we convert into a Uint8Array of raw JPEG bytes; the jpeg-js decoder then turns those bytes into pixel data, from which we create a 3D tensor.
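The base64-to-bytes step can be seen in isolation with plain Node.js. This sketch uses Buffer instead of tf.util.encodeString, and the three-byte payload is fabricated for illustration:

```javascript
// Fabricate a tiny "file": the three JPEG magic bytes, base64-encoded,
// standing in for what FileSystem.readAsStringAsync returns.
const imgB64 = Buffer.from([0xff, 0xd8, 0xff]).toString('base64') // "/9j/"

// Decode the base64 string back into raw bytes, as the app does
// before handing them to the jpeg-js decoder.
const raw = new Uint8Array(Buffer.from(imgB64, 'base64'))

console.log(Array.from(raw)) // [ 255, 216, 255 ]
```

Real JPEG files always start with the bytes 0xFF 0xD8 0xFF, which is why base64 image strings so often begin with "/9j/".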

imageToTensor(rawImageData) {
  const TO_UINT8ARRAY = true
  const { width, height, data } = jpeg.decode(rawImageData, TO_UINT8ARRAY)

  // drop the alpha channel: RGBA -> RGB
  const buffer = new Uint8Array(width * height * 3)
  let offset = 0 // offset into original data
  for (let i = 0; i < buffer.length; i += 3) {
    buffer[i] = data[offset]
    buffer[i + 1] = data[offset + 1]
    buffer[i + 2] = data[offset + 2]

    offset += 4
  }

  return tf.tensor3d(buffer, [height, width, 3])
}
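To see what the loop does, here is the same RGBA-to-RGB strip run on a fabricated 2x1 "image" of two opaque pixels (plain JavaScript, no TensorFlow needed):

```javascript
const width = 2, height = 1
// Two RGBA pixels: pure red and pure green, both fully opaque.
const data = Uint8Array.from([255, 0, 0, 255, 0, 255, 0, 255])

// Same copy loop as imageToTensor: 3 output bytes per pixel,
// advancing 4 input bytes per pixel to skip the alpha channel.
const buffer = new Uint8Array(width * height * 3)
let offset = 0
for (let i = 0; i < buffer.length; i += 3) {
  buffer[i] = data[offset]
  buffer[i + 1] = data[offset + 1]
  buffer[i + 2] = data[offset + 2]
  offset += 4
}

console.log(Array.from(buffer)) // [ 255, 0, 0, 0, 255, 0 ]
```

The alpha bytes are dropped because coco-ssd expects a tensor of shape [height, width, 3], i.e. RGB only.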

For the rendering part, we create a method renderPrediction which takes one result from our model.detect call. Since we are using COCO-SSD (Single Shot MultiBox Detection, capable of identifying 90 classes of objects), each result contains a bounding box along with a class name and a confidence score.

renderPrediction = (prediction, index) => {
  const pclass = prediction.class;
  const score = prediction.score;
  const x = prediction.bbox[0];
  const y = prediction.bbox[1];
  const w = prediction.bbox[2];
  const h = prediction.bbox[3];

  return (
    <Text key={index} style={styles.text}>
      Prediction: {pclass} {', '} Probability: {score} {', '} Bbox: {x} {', '} {y} {', '} {w} {', '} {h}
    </Text>
  )
}
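Since model.detect resolves to an array of { class, score, bbox } objects, a common follow-up before rendering is filtering out low-confidence detections. A sketch with made-up predictions and an illustrative 0.5 threshold:

```javascript
// Fabricated detect() output, mimicking coco-ssd's result shape.
const predictions = [
  { class: 'person', score: 0.92, bbox: [12, 30, 160, 320] },
  { class: 'dog',    score: 0.41, bbox: [200, 180, 90, 60] },
]

// Keep only detections at or above the confidence threshold.
const confident = predictions.filter((p) => p.score >= 0.5)

console.log(confident.map((p) => p.class)) // [ 'person' ]
```

In the app, the same filter could be applied to this.state.predictions before mapping over it with renderPrediction.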

Here we go: the result of a sample prediction running on a Note10+, which takes less than 2 seconds to show.

Please follow me if you find this article useful; it motivates me to write more useful articles and help others learn. Thank you.


