Balancing Performance and Sustainability: How Plastic Origins chooses its AI solution for mobile devices

Charles Ollion
Feb 16, 2023 · 6 min read
Plastic Origins app to map plastic litter on river banks

Plastic waste is a growing problem that affects rivers, oceans, and wildlife all over the world. To address this issue, Surfrider has developed a mobile application called Plastic Origins. An automated version of the app uses artificial intelligence (AI) to detect and identify different types of plastic waste on the banks of rivers. But how does the app work, and how does it process the information? In this article, we explore the technical choices behind the development of Plastic Origins, specifically the decision of whether to run the AI system on the volunteer’s mobile device or on a remote server. We will see that this decision is far from trivial and depends on a number of factors, such as user experience, development costs, and environmental impact.

Embedded and Offloaded

This is a question that the developers of many mobile applications involving artificial intelligence have had to ask. For example, do you know whether the Shazam app, which recognizes music, or PlantNet, which lets you identify a plant from a photo, works on the user’s device (embedded) or runs on a remote server (offloaded)? And does it matter?

Source: plantnet.org

Originally, the answer was that PlantNet sends data to a server, just like Shazam. However, in the case of PlantNet the image itself is sent to the server (which can be a fairly heavy operation), while in the case of Shazam the phone does part of the computation and sends only a lightweight “fingerprint”, a sort of compressed summary of the recorded sound. Note that PlantNet recently introduced a completely offline mode.

From the user’s point of view, the more computation can be done on the phone, the more responsive, and therefore pleasant to use, the application will be. Offloading the computation means sending an image, waiting for the server’s response, and finally processing that response in the application, which can become slow when, for example, the internet connection is weak. For an application where calculations must happen in real time, an embedded system is almost always essential.
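To make that round trip concrete, here is a minimal sketch, in Python, of what an offloaded detection call looks like from the client side. The endpoint URL and the response format are hypothetical, not the actual Plastic Origins or PlantNet API; the point is simply that the image must travel to the server and the app has to wait for the answer.

```python
import requests  # third-party HTTP client (pip install requests)

# Hypothetical endpoint; any real service will have its own URL and payload format.
INFERENCE_URL = "https://example.org/api/detect"

def offloaded_detection(image_path: str, timeout_s: float = 10.0) -> list:
    """Send one image to a remote detection service and wait for the result.

    The full round trip (upload + remote inference + download) is what makes
    the offloaded approach feel slow on a weak connection.
    """
    with open(image_path, "rb") as f:
        response = requests.post(
            INFERENCE_URL,
            files={"image": f},
            timeout=timeout_s,  # easily exceeded in the field with poor coverage
        )
    response.raise_for_status()
    return response.json()["detections"]  # e.g. [{"label": "bottle", "box": [...]}]
```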

On the other hand, from the point of view of the application’s development, embedding an AI system in a phone (or rather in several types of phones, if we want it to work across iPhone, Samsung and other models) is much, much more complex than creating an offloaded service running on a server. So, to decide whether it is better to Embed or Offload the calculation engine, a whole set of criteria has to be taken into account, and the rest of this article goes through this exercise in the case of Plastic Origins.

Plastic Origins

The Plastic Origins system detects waste from a video stream and GPS positions. The first version of the system, offloaded, works as follows (a simplified sketch follows the list):

  • The user films the river bank while the app records the GPS trace
  • When the session is over, the video is sent to the servers, where the AI detects and tracks the litter items
  • The litter items and their positions are then inserted into a database to be analysed and sent to a cartography app
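As a rough illustration, the server-side part of this offloaded pipeline can be sketched as below. This is a Python sketch under stated assumptions, not the actual Surfrider code: the detection-and-tracking step is only a placeholder, and the GPS trace is assumed to be a time-sorted list of (timestamp, latitude, longitude) points.

```python
import bisect
from dataclasses import dataclass

import cv2  # OpenCV, used here only to read frames from the uploaded video


@dataclass
class LitterDetection:
    label: str
    latitude: float
    longitude: float


def nearest_gps(gps_trace, timestamp):
    """Return the (lat, lon) of the first GPS point at or after the frame time.

    gps_trace is assumed to be a time-sorted list of (t, lat, lon) tuples.
    """
    times = [t for t, _, _ in gps_trace]
    i = min(bisect.bisect_left(times, timestamp), len(gps_trace) - 1)
    _, lat, lon = gps_trace[i]
    return lat, lon


def detect_and_track(frame):
    """Placeholder for the heavy object-detection + tracking model."""
    return []  # e.g. [("bottle", track_id), ...] in a real pipeline


def process_session(video_path, gps_trace):
    """Simplified server-side processing of one recording session."""
    cap = cv2.VideoCapture(video_path)
    results, seen_tracks = [], set()
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        t = cap.get(cv2.CAP_PROP_POS_MSEC) / 1000.0  # frame timestamp in seconds
        for label, track_id in detect_and_track(frame):
            if track_id not in seen_tracks:  # count each tracked item once, not per frame
                seen_tracks.add(track_id)
                lat, lon = nearest_gps(gps_trace, t)
                results.append(LitterDetection(label, lat, lon))
    cap.release()
    return results  # inserted into the database, then pushed to the cartography app
```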

More information can be found in the first article. A demo of this system is available below:

A few things are specific to Plastic Origins. A video stream is a much heavier object than a sound clip (in the case of Shazam) or a single image (in the case of PlantNet), and sending a video to a server while out on a river can be complicated. Processing this stream requires an object detection system, which is computationally heavy, followed by an object tracking system adapted to a moving camera, which is complex to develop.

Detecting waste in the bushes along the banks is a difficult task that depends on the position and stability of the camera (and therefore of the kayak) and on the distance to the bank. Under these conditions, volunteers need feedback from the application to know whether the waste has been taken into account and to adjust their behavior. Quick feedback is also motivating and makes the application more engaging to use, which is critical for a participatory science project. Finally, video recordings of large portions of rivers produce huge files, which is not practical.

Embedded version

Under these conditions, it was decided to build an Embedded version of the application. First, the drawbacks linked to this decision:

  • New, complex developments to carry out
  • Need for different versions of the AI depending on the phone model, since models behave differently
  • Will not work on all phones (phones that are too old will not be able to perform the computation)
  • Uses more battery

But the pros make up for it:

  • Enables real-time feedback, which improves the user experience and opens up different usage patterns
  • Enables an offline mode, or use with a weak internet connection
  • Makes the server-side service much simpler, cheaper and less energy-consuming
  • Does not require the transmission and storage of potentially sensitive, private or voluminous data

Environmentally

What is the best solution from an energy point of view (and therefore in terms of greenhouse gas emissions)? The answer is not obvious in general, but in the case of Plastic Origins the embedded solution appears much better: the embedded calculation system is far more efficient and frugal than the AI on the server, which also requires a complex production infrastructure (servers dedicated to computation, a processing queue, a video storage system, a video transmission system) and significant network transfers. However, the overall usage and technological footprint must also be taken into account: if the system only works on the latest smartphones, it is neither democratic nor frugal, as it pushes users (even indirectly) to switch to a newer smartphone.

Technically

One of the difficulties of such a project is managing complexity, to ensure that the project can be maintained and run in the long term (for example by an open-source community). Each technical decision that makes the project more complex, or that requires more specific knowledge, has a cost that must be factored into the decision-making. The addition of an Embedded component is a telling example: it requires specific skills that are not widely available (developing highly specific calculation engines inside native mobile applications, something mastered neither by mobile application developers nor by artificial intelligence developers specializing in server-side AI). The decision to move to an Embedded system is therefore conditioned on the availability of people mastering this specific aspect. However, once this hurdle is cleared, the overall system is potentially less complex, because the server side becomes much simpler to maintain and less expensive. In our case, we use MediaPipe and TensorFlow Lite conversion of our detection models, which are well suited, but there are plenty of obscure bugs to solve before you actually have a fast, reliable app…
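For illustration, converting a trained detection model to TensorFlow Lite typically looks like the sketch below. The SavedModel path and the choice of optimization flag are placeholder assumptions, not the exact settings used by Plastic Origins.

```python
import tensorflow as tf

# Hypothetical path to a trained detection model exported as a TensorFlow SavedModel.
SAVED_MODEL_DIR = "exported_models/litter_detector/saved_model"

converter = tf.lite.TFLiteConverter.from_saved_model(SAVED_MODEL_DIR)

# Default optimizations enable post-training quantization, which shrinks the model
# and speeds up on-device inference at a small cost in accuracy.
converter.optimizations = [tf.lite.Optimize.DEFAULT]

tflite_model = converter.convert()

with open("litter_detector.tflite", "wb") as f:
    f.write(tflite_model)
```

The resulting .tflite file can then be bundled into the mobile application and executed on the device through the TensorFlow Lite or MediaPipe runtime; in practice, operators that are unsupported, or that behave differently on the phone, are a common source of the kind of obscure bugs mentioned above.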

Final Thoughts

In addition to technical and environmental considerations, it is important to consider the societal impact of the choice between an embedded and an offloaded AI system. An embedded system can increase accessibility, especially for users in remote or low-connectivity areas, who can still use the application offline. On the other hand, an offloaded system can be more beneficial for users with lower-end devices and can also reduce the e-waste generated by the pressure to upgrade to newer devices. Privacy and security of the data also matter: offloading the computation means transmitting and storing data on a third-party server, which may raise concerns about data protection. Whatever the choice, its broader societal impact should always be part of the decision.
