Unlock AI Power in Your Hybrid Mobile App: Local Embedding of HuggingFace Model with Transformers.js

MPyK · Published in The Web Tub · Mar 5, 2024 · 6 min read
Credit to https://huggingface.co/blog/optimum-onnxruntime-training

In previous articles, we demonstrated how to utilize Hugging Face models via API endpoint calls in mobile applications. While this method provides a quick and easy way to test out a model, it can be unstable due to the unpredictability of free endpoints. Therefore, in this tutorial, we will introduce an alternative approach: downloading the model from Hugging Face and running it locally with Transformers.js.

Get Started

First, let’s clone this GitHub repository to your local directory. Then, we’ll delve into the key components and codebase of the project. The application we’re developing is a text classification or sentiment analysis app: users input a sentence, and the AI assesses whether it’s positive or negative, or rates the text on a scale of 1 to 5 stars.

The App — Text Classification
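The models’ predictions come back as label/score pairs, which the app simply prints as JSON. For example (the values shown are illustrative):

[{"label": "POSITIVE", "score": 0.9997}]   // sentiment model
[{"label": "5 stars", "score": 0.85}]      // star-rating model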

The HTML

The application we’re constructing is refreshingly straightforward. Let’s start by examining the index.html file.

<!DOCTYPE HTML>
<html>
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1, user-scalable=no">
  <meta http-equiv="Content-Security-Policy" content="default-src * data: gap: content: https://ssl.gstatic.com; style-src * 'unsafe-inline'; script-src * 'unsafe-inline' 'unsafe-eval'">
  <script src="components/loader.js"></script>
  <link rel="stylesheet" href="components/loader.css">
  <link rel="stylesheet" href="css/style.css">
</head>
<body>
  <!-- Loading image -->
  <div id="loading">
    <img src="loading.gif" alt="Loading..." width="250" />
  </div>

  <h1>Hugging Face</h1>

  <!-- Result section -->
  <div id="result">
    This is result.
  </div>

  <!-- Query section -->
  <label>Query:</label>
  <textarea id="query" placeholder="I like Monaca">I like Monaca</textarea>

  <!-- Buttons -->
  <button id="btnSentimenLocal">Text Classification (Local model file)</button>
  <button id="btnSentimentCache">Sentiment Analysis</button>

  <script src="main.js" type="module"></script>
</body>
</html>

As you can see, the interface is divided into four sections. The loading image is displayed while the model is loading or processing data. The “result” section shows the application status and the model’s predictions. The “query” section lets users enter their text. Finally, there are two buttons: “Text Classification” uses the model files embedded locally in the application, while “Sentiment Analysis” loads the model from the Hugging Face cloud.

The JavaScript

As you may have noticed in the HTML file above, we also import the JavaScript file main.js, which houses the logic for handling user interactions and executing the AI functionalities. It utilizes the ES6 module syntax (type="module"). Now, let's delve deeper into its components.

import { pipeline, env } from 'https://cdn.jsdelivr.net/npm/@xenova/transformers@2.15.1';

// global vars
let sentimentPipeline = null;
let textClassificationPipeline = null;
let startTime;
const resultEl = document.getElementById('result');
const loadingImage = document.getElementById('loading');

const updateResult = (html) => {
  resultEl.innerHTML = html;
  console.log(html); // debug purpose
};

const start = () => {
  loadingImage.style.display = 'block'; // show loading image
  updateResult('Loading ...');
  startTime = performance.now(); // calculate processing time
};

const predict = async (type, pipelineModel, sentence) => {
  if (!sentence) {
    loadingImage.style.display = 'none';
    alert('Input some query');
    return;
  }
  if (!pipelineModel) {
    alert('The model is not loaded.');
    loadingImage.style.display = 'none';
    return;
  }
  updateResult('Processing ...');
  try {
    // process the sentence and return the result from the model pipeline function
    const result = await pipelineModel(sentence);
    const endTime = performance.now(); // calculate processing time
    if (result) {
      updateResult(`
        <p>${JSON.stringify(result)}</p>
        <p>Time taken: ${((endTime - startTime) / 1000).toFixed(2)} seconds (${type}).</p>
      `);
    }
  } catch (e) {
    updateResult(e.toString());
  }
  loadingImage.style.display = 'none'; // hide loading image
};

const getBrowserCachePipeline = async (useRemote, taskName, modelName) => {
  let pipelineModel;
  env.allowRemoteModels = useRemote; // true: use HuggingFace cloud model. false: use downloaded model in browser cache
  try {
    pipelineModel = await pipeline(taskName, modelName);
  } catch (e) {
    console.log(e);
    pipelineModel = null;
  }
  return pipelineModel;
};

const useBrowserCache = async (pipelineCache, taskName, modelName) => {
  env.useBrowserCache = true;
  if (!pipelineCache) {
    updateResult('Loading model from cache...');
    pipelineCache = await getBrowserCachePipeline(false, taskName, modelName);
    if (!pipelineCache) {
      updateResult('Cache not available. Loading remote model...');
      pipelineCache = await getBrowserCachePipeline(true, taskName, modelName);
    }
    updateResult('Model is loaded.');
  }
  await predict('browser', pipelineCache, document.getElementById('query').value);
  return pipelineCache;
};

const useLocalModel = async (pipelineCache, taskName, modelName) => {
  env.localModelPath = './models/';
  env.allowRemoteModels = false;
  env.allowLocalModels = true;
  env.useBrowserCache = false;
  if (!pipelineCache) {
    updateResult('Loading model from local...');
    pipelineCache = await pipeline(taskName, modelName);
    updateResult('Model is loaded.');
  }
  await predict('local', pipelineCache, document.getElementById('query').value);
  return pipelineCache;
};

const sentimentClassifyLocalModel = async () => {
  start();
  // this model doesn't provide an ONNX file, so we use the converter tool to convert the PyTorch model to ONNX and upload it to the www/models/local folder.
  textClassificationPipeline = await useLocalModel(
    textClassificationPipeline,
    'text-classification',
    'local/nlptown/bert-base-multilingual-uncased-sentiment'
  );
};

const sentimentClassify = async () => {
  start();
  // this model has an ONNX file, so we can use it directly from HuggingFace. It is downloaded first, then cached.
  sentimentPipeline = await useBrowserCache(
    sentimentPipeline,
    'sentiment-analysis',
    'Xenova/distilbert-base-uncased-finetuned-sst-2-english'
  );
};

const main = () => {
  document.getElementById('btnSentimentCache').addEventListener('click', sentimentClassify, false);
  document.getElementById('btnSentimenLocal').addEventListener('click', sentimentClassifyLocalModel, false);
};

main();

Transformers.js is designed to mirror Hugging Face’s transformers Python library, providing functionality that allows you to run the same pretrained models using a very similar API. For a comprehensive understanding of Transformers.js, please refer to the full documentation.
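For instance, loading a model and running it on a sentence takes only a couple of lines. Here is a minimal sketch using the same CDN import and model as this article (the output in the comment is illustrative):

import { pipeline } from 'https://cdn.jsdelivr.net/npm/@xenova/transformers@2.15.1';

// create a pipeline for a task, then call it like a function
const classifier = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncased-finetuned-sst-2-english');
const output = await classifier('I like Monaca');
// e.g. [{ label: 'POSITIVE', score: 0.99 }]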

Transformers vs Transformers

The only caveat is that Transformers.js currently supports only ONNX models, leveraging ONNX Runtime to execute them in the browser. However, we’ve got you covered with a converter tool that can easily convert pretrained PyTorch, TensorFlow, or JAX models to ONNX format. We’ve provided a Google Colab script for your convenience, so simply sign up for a free account, specify the desired model for conversion, and run the script. If everything goes smoothly, you should obtain results similar to the example below. You can then download the folders and upload them to your project directory. For this application, we’ve uploaded them to the “www/models/local” directory.

ONNX Converter
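After conversion, the downloaded folder should contain the tokenizer and configuration files along with the ONNX weights. For this app, the layout under www/models/local looks roughly like this (the exact file names can vary depending on the model and converter options):

local/nlptown/bert-base-multilingual-uncased-sentiment/
├── config.json
├── tokenizer.json
├── tokenizer_config.json
└── onnx/
    └── model_quantized.onnx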

Some Explanation

First things first, to utilize the Transformers library, we need to import the package into the project.

import { pipeline, env } from 'https://cdn.jsdelivr.net/npm/@xenova/transformers@2.15.1';

To use the model locally, we need to set up an environment variable (env.localModelPath) specifying the path to load the model from. Additionally, we instruct Transformers to utilize the model locally, rather than remotely or from the browser cache.

env.localModelPath = './models/';
env.allowRemoteModels = false;
env.allowLocalModels = true;
env.useBrowserCache = false;

The App — Sentiment Analysis

To utilize the model remotely or from the browser cache, we also need to configure the ‘env’ settings. Setting ‘env.allowRemoteModels’ to true enables the use of remote models, while setting it to false directs the application to use the model stored in the browser cache (provided ‘env.useBrowserCache’ is set to true). When using the browser cache, the model is initially downloaded from the cloud, and subsequent usage relies on the cached version. It’s important to note that for this setup to work, the model uploaded to Hugging Face must include ONNX model files in the repository.
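Condensed from the useBrowserCache and getBrowserCachePipeline functions above, the first (remote) load boils down to something like this sketch (it assumes the same pipeline and env imports as main.js):

env.useBrowserCache = true;    // cache the downloaded model files in the browser
env.allowRemoteModels = true;  // allow downloading from the Hugging Face Hub on first use
const remotePipeline = await pipeline(
  'sentiment-analysis',
  'Xenova/distilbert-base-uncased-finetuned-sst-2-english'
);
// on later runs, env.allowRemoteModels can be set to false and the cached files are used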

The remaining code is dedicated to displaying the application status, loading image, and other related elements. It is designed to be simple and straightforward in its implementation.

Running On Mobile

If you want to turn this into a mobile application, you can sign up for a free account with Monaca and import this project. You can then run it on both Android and iOS devices in no time.

Conclusion

In conclusion, this tutorial has provided a comprehensive overview of embedding a Hugging Face AI model into a hybrid mobile application using Transformers.js.

Furthermore, we addressed the limitation that Transformers.js supports only ONNX models, and provided a solution by offering a converter tool and walking through its usage. By following the steps outlined in this tutorial, developers can seamlessly integrate powerful AI capabilities into their hybrid mobile applications, opening up new possibilities for enhancing user experiences and functionality.

Overall, this tutorial serves as a valuable resource for developers looking to leverage the capabilities of Hugging Face models in their mobile applications, providing clear instructions and insights into the process of local model embedding and usage.

Happy Coding.
