How we navigated running machine learning models on Edge devices

Madhur Zanwar
Published in Eumentis · 6 min read · Mar 22, 2024

Navigating through a digital labyrinth. Credits: Adobe Firefly

In this article, we’ll talk about how we run machine learning models on Edge devices, the challenges we face, and how we overcome them.

We are well acquainted with running machine learning models on web devices. Porting and running the same models on Edge devices is often a challenge for various reasons: limited computation power, speed, model size, and the libraries supported on Edge devices. Additionally, a vast number of libraries support web operations, which makes tasks more straightforward on web devices than on mobile devices.

A more significant challenge arises when we need to integrate our machine learning pipeline with the framework we’ve used to build our Android and iOS apps.

Our use case

We had a pipeline of two models, image classification and object detection, running sequentially, one after the other. We used PyTorch to build the classification model and YOLOv8 by Ultralytics to build the object detection model. Image classification helped filter out unwanted images: those of poor quality or where the object to be detected was not well positioned. The filtered images were then passed to the object detection model. Since the object to be detected was significantly smaller than the image, we divided the image into tiles of 960 × 960 pixels. The tiling process required a substantial number of calculations involving the image's height and width, and squaring of the image, to ensure the tiling process proceeded smoothly.
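To make the tiling step concrete, here is a minimal TypeScript sketch of the tile-coordinate calculation. The names and the edge-handling strategy are simplified illustrations, not our exact production code:

```typescript
// A minimal sketch (hypothetical names) of the tiling math: split an image
// into 960 x 960 tiles, clamping edge tiles so they stay inside the image.
interface Tile {
  x: number;
  y: number;
  width: number;
  height: number;
}

const TILE_SIZE = 960;

function computeTiles(imageWidth: number, imageHeight: number): Tile[] {
  const tiles: Tile[] = [];
  for (let y = 0; y < imageHeight; y += TILE_SIZE) {
    for (let x = 0; x < imageWidth; x += TILE_SIZE) {
      // Shift edge tiles back so they overlap their neighbour instead of
      // running past the image boundary (assumes the image is at least
      // 960 px on each side, e.g. after squaring it on a white background).
      tiles.push({
        x: Math.min(x, Math.max(0, imageWidth - TILE_SIZE)),
        y: Math.min(y, Math.max(0, imageHeight - TILE_SIZE)),
        width: TILE_SIZE,
        height: TILE_SIZE,
      });
    }
  }
  return tiles;
}

// A 4000 x 3000 image, for example, yields a 5 x 4 grid of tiles.
```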

We aimed to replicate the same steps on mobile devices. We had built our app in React Native, and now we were looking to integrate it with the machine learning pipeline.

Before starting our work, we first explored the machine learning options available for mobile devices. Sharing key points from our research:

TensorFlow Lite and ML Kit

  1. TensorFlow Lite and ML Kit, both by Google, are tools that make on-device deployment of machine learning models easy.
  2. TensorFlow Lite is an open-source framework for running TensorFlow models on mobile and IoT devices.
  3. TensorFlow Lite enables on-device machine learning by addressing five key constraints: latency, privacy, connectivity, size, and power consumption.
  4. Not all TensorFlow models can be converted to TensorFlow Lite models; there are some limitations, such as unsupported operations.
  5. ML Kit is a mobile machine learning framework that provides pre-trained machine learning models and APIs for common tasks in mobile applications.

For iOS devices

  1. Apple provides the Core ML and Create ML tools, built for iOS devices only.
  2. Core ML, like TensorFlow Lite, is optimized for on-device inference.

ONNX (Open Neural Network Exchange)

  1. It’s a format built to represent machine learning models.
  2. With ONNX, we can convert machine learning models from one framework to another, taking advantage of the strengths of different frameworks.

TensorFlow Lite and PyTorch Mobile are libraries designed to assist in converting models built in their web counterparts into their respective mobile versions. However, there are situations where we may have constructed models in different web frameworks and, depending on our mobile app framework or other considerations, require a different framework. In such cases, ONNX comes to our rescue, facilitating the conversion of models from one framework to another.
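For illustration only (this is not the route we ultimately took), a model exported to ONNX can be run inside a React Native app with the onnxruntime-react-native package. The model path and input shape in this sketch are assumptions:

```typescript
// A sketch of running an ONNX-exported model in React Native via the
// onnxruntime-react-native package. Assumes the model takes a single
// 640 x 640 RGB image in NCHW layout; adjust to your model's real inputs.
import { InferenceSession, Tensor } from 'onnxruntime-react-native';

async function runOnnxModel(modelPath: string, pixels: Float32Array) {
  const session = await InferenceSession.create(modelPath);
  const input = new Tensor('float32', pixels, [1, 3, 640, 640]);
  // Feed the tensor to the model's first input and return its first output.
  const results = await session.run({ [session.inputNames[0]]: input });
  return results[session.outputNames[0]];
}
```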

PyTorch Mobile

  1. Enables PyTorch models to be deployed in mobile applications.

When planning the implementation on mobile devices, there are several considerations to keep in mind:

  1. Data Pre and Post-processing: On the web, we often perform data pre and post-processing using libraries like OpenCV and NumPy. However, these libraries may not be directly available in the mobile framework, so alternative approaches or libraries may be needed.
  2. Choice of Language: The programming language used for web scripts becomes important, as not all web libraries and frameworks are easily transferable to mobile platforms.
  3. Accuracy vs. Constraints: It’s essential to define the desired accuracy for your mobile model and consider any time constraints. Smaller models are often preferred on mobile devices due to factors like app size and processing speed, but this may result in a drop in accuracy.
  4. Memory Management: Unlike web devices, mobile devices have more limited memory resources. Larger model sizes and complex computational code can lead to memory-related issues and even cause the mobile app to crash.

Our primary focus was on the object detection model and how we could replicate its pre-processing steps using libraries supported in React Native. Our main operations involved cropping the image and overlaying one image on top of another. We had used opencv-python to perform these tasks on the web. Since Python does not integrate directly with React Native, we would have had to use OpenCV's native Android and iOS libraries, which would have introduced an extra layer of complexity and required learning Swift for iOS. Consequently, we had to explore alternatives that would allow us to execute OpenCV-style operations on mobile devices within the React Native app.

Here are some useful articles we found while exploring OpenCV for Android and iOS.

OpenCV for Android

  1. OpenCV Docs
  2. A Beginner's guide to setting up the OpenCV Android Library on Android Studio
  3. A Guide to preparing OpenCV for Android

OpenCV for iOS

  1. OpenCV Docs
  2. OpenCV for iOS — GitHub
  3. iOS OpenCV Tutorial — GitHub

We began by exploring well-known libraries for machine learning applications, namely TensorFlow and PyTorch. TensorFlow Lite and PyTorch Mobile are mobile-optimized versions of these libraries. While there is tfjs-react-native, a TensorFlow library for using TensorFlow operations in React Native, we found it to be less robust: we did not find many examples demonstrating YOLOv8 implementations or functions for cropping and overlaying. However, we discovered a library in the PyTorch ecosystem, react-native-pytorch-core, that integrates with React Native.

Since our object detection model was built using the YOLOv8 framework, which is itself built on PyTorch, and react-native-pytorch-core had good documentation for implementing YOLOv5 models, we decided to delve deeper into it. Our main focus was to determine whether we could implement the OpenCV operations of cropping and overlaying using react-native-pytorch-core. From examining its functions, it was hard to tell whether it could be used for this purpose. Consequently, we chose to begin implementing the operations line by line and figure it out as we went along.

During implementation, we couldn't find any methods in react-native-pytorch-core for the overlaying and cropping operations. Our overlay operation involved creating a blank white image and placing the original image on top of it. Further research led us to a React Native library called react-native-photo-manipulator, which enabled us to perform both overlaying and cropping operations.
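Here is a simplified sketch of how these two operations look with react-native-photo-manipulator. Paths and sizes are illustrative, not our exact code:

```typescript
// Both calls return the path of a new image file written by the native module.
import RNPhotoManipulator from 'react-native-photo-manipulator';

// Crop one 960 x 960 tile starting at (x, y) out of the source image.
async function cropTile(imagePath: string, x: number, y: number): Promise<string> {
  return RNPhotoManipulator.crop(imagePath, { x, y, width: 960, height: 960 });
}

// Overlay the original image onto a white background at the top-left
// corner, which is how an image can be "squared" before tiling.
async function overlayOnWhite(whitePath: string, imagePath: string): Promise<string> {
  return RNPhotoManipulator.overlayImage(whitePath, imagePath, { x: 0, y: 0 });
}
```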

After implementing the overlaying and cropping operations using react-native-photo-manipulator, and performing operations like loading the image, converting it to a tensor, and loading the ML model with the assistance of react-native-pytorch-core, we encountered an issue where the app would crash after processing 4 to 5 tiles of an image. Every processed tile kept adding to memory, which eventually led to the crash. Despite trying various options, none of them seemed to work. Ultimately, the problem was resolved by switching the React Native JavaScript engine to Hermes.
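For reference, here is a condensed sketch of the per-tile step with react-native-pytorch-core. The model path is illustrative, YOLO-specific decoding of the raw output is omitted, and in practice the model would be loaded once and reused across tiles. Note the explicit release() on the native image, which is the library's mechanism for freeing image memory:

```typescript
import { torch, media, ImageUtil } from 'react-native-pytorch-core';

async function runOnTile(modelPath: string, tilePath: string) {
  // Load a TorchScript "lite" model and the tile image from disk.
  const model = await torch.jit._loadForMobile(modelPath);
  const image = await ImageUtil.fromFile(tilePath);

  // Convert the image to a float tensor in NCHW layout, scaled to [0, 1].
  const blob = media.toBlob(image);
  let tensor = torch.fromBlob(blob, [image.getHeight(), image.getWidth(), 3]);
  tensor = tensor.to({ dtype: torch.float32 }).div(255);
  tensor = tensor.permute([2, 0, 1]).unsqueeze(0);

  const output = await model.forward(tensor);

  // Releasing the native image frees its memory; forgetting this is an easy
  // way to accumulate memory tile after tile.
  await image.release();
  return output;
}
```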

The next challenge emerged when creating the blank image onto which we would overlay our original image: generating the blank image led to a significant increase in memory usage. The workaround we implemented was to bundle a fixed-size white image with the app and crop it to the desired size, rather than creating a blank image from scratch during the inference process.
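A sketch of that workaround (the bundled asset name is hypothetical):

```typescript
import RNPhotoManipulator from 'react-native-photo-manipulator';

// A fixed-size white PNG shipped with the app bundle.
const WHITE_BASE = require('../assets/white-4096.png');

async function blankCanvas(width: number, height: number): Promise<string> {
  // Cropping an existing asset proved far cheaper in memory than generating
  // a new blank image of the same size on the fly.
  return RNPhotoManipulator.crop(WHITE_BASE, { x: 0, y: 0, width, height });
}
```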

With this, we settled on the react-native-pytorch-core and react-native-photo-manipulator libraries.
