Developing Microsoft Edge extensions powered by SoTA NLP models

A step-by-step tutorial on building a Microsoft Edge extension for text paraphrasing using PEGASUS, KeyBERT, and WordNet


Bird’s-eye view of how the extension works

Let’s start by giving the extension a name. I will call it Janus AI: Smart Paraphrasing Solution. (In Roman mythology, Janus is the god of change and transitions, and the idea of paraphrasing is to “change” the expression of some text for better clarity.)

The extension accepts input in two ways:

  • The user can type or paste text into the input box.
  • The user can select a piece of text in any browser tab and then click the extension, which automatically populates the input box with the selection.

Developing the RESTful API server

Janus AI requires a RESTful API server: the extension sends the user’s input text to it and receives the paraphrased text along with the keywords and their synonyms. I build this server with FastAPI because it is powerful yet simple to use and well documented. The server’s API endpoints use the modules for paraphrasing, keyword extraction, and synonym finding.

Preliminaries

First, I create a folder called server/ for the API server, which contains two files: main.py and paraphrasingModule.py. Then I install the following packages in a virtual environment:

fastapi==0.75.2
keybert==0.5.1
nltk==3.7 # To leverage WordNet for finding synonyms
sentence-splitter==1.4 # To split sentences from input text
sentencepiece==0.1.96 # Required by PEGASUS paraphraser model
torch==1.11.0
transformers==4.18.0
uvicorn==0.17.6 # Lightweight server to run FastAPI code

Module for paraphrasing, keyword extraction and synonym finding

The server/paraphrasingModule.py file contains the PEGASUS paraphraser, KeyBERT keyword extractor, and synonym finder.
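
As a rough illustration of the keyword extraction and synonym finding parts, here is a minimal sketch. The function names (extract_keywords, find_synonyms, collect_synonyms), the top_n and limit defaults, and the deduplication logic are assumptions made for this example rather than the exact code of the module. It relies on KeyBERT's extract_keywords method and on NLTK's WordNet corpus, which has to be downloaded once with nltk.download("wordnet").

```python
def extract_keywords(text, top_n=5, model=None):
    """Return the top_n single-word keywords of `text` using KeyBERT."""
    if model is None:
        # Imported lazily because loading the underlying model is slow.
        from keybert import KeyBERT
        model = KeyBERT()
    pairs = model.extract_keywords(
        text,
        keyphrase_ngram_range=(1, 1),  # single words only (assumption)
        stop_words="english",
        top_n=top_n,
    )
    # KeyBERT returns (keyword, score) tuples; keep only the keywords.
    return [keyword for keyword, _score in pairs]


def collect_synonyms(word, lemma_names, limit=5):
    """Clean and deduplicate WordNet lemma names, dropping the word itself."""
    synonyms = []
    for name in lemma_names:
        candidate = name.replace("_", " ").lower()
        if candidate != word.lower() and candidate not in synonyms:
            synonyms.append(candidate)
    return synonyms[:limit]


def find_synonyms(word, limit=5):
    """Look up synonyms of `word` in WordNet (requires the 'wordnet' corpus)."""
    from nltk.corpus import wordnet
    lemma_names = [
        lemma.name()
        for synset in wordnet.synsets(word)
        for lemma in synset.lemmas()
    ]
    return collect_synonyms(word, lemma_names, limit)
```

Keeping the cleanup step (collect_synonyms) separate from the WordNet lookup makes it easy to check the deduplication logic without downloading any corpus.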

The paraphraser generates text with the following parameters:

  • num_return_sequences is set to 1 to return only one paraphrase per sentence
  • num_beams is set to 10 so that beam search keeps 10 candidate sequences while generating the paraphrase
  • max_length is set to 100 to cap each paraphrased sentence at 100 tokens
  • temperature is set to 1.5 to generate more diverse outputs (temperature controls prediction randomness)
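
To show where the parameters listed above would go, here is a minimal sketch of a paraphraser built on the Hugging Face transformers API. The checkpoint name tuner007/pegasus_paraphrase, the function names, and the GENERATION_KWARGS constant are assumptions for illustration; the article does not state which checkpoint or helper names the module uses.

```python
# Decoding settings taken from the list above.
GENERATION_KWARGS = {
    "num_return_sequences": 1,
    "num_beams": 10,
    "max_length": 100,
    # Note: in transformers, temperature mainly affects sampling-based
    # decoding, so its effect with plain beam search may be limited.
    "temperature": 1.5,
}

# Assumption: a publicly available PEGASUS paraphrasing checkpoint.
MODEL_NAME = "tuner007/pegasus_paraphrase"


def load_paraphraser(model_name=MODEL_NAME):
    """Load the tokenizer and model once, e.g. at server start-up."""
    # Imported lazily because torch and transformers are heavy dependencies.
    import torch
    from transformers import PegasusForConditionalGeneration, PegasusTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = PegasusTokenizer.from_pretrained(model_name)
    model = PegasusForConditionalGeneration.from_pretrained(model_name).to(device)
    return tokenizer, model, device


def paraphrase_sentence(sentence, tokenizer, model, device="cpu"):
    """Paraphrase a single sentence with the settings above."""
    batch = tokenizer(
        [sentence], truncation=True, padding="longest", return_tensors="pt"
    ).to(device)
    generated = model.generate(**batch, **GENERATION_KWARGS)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]


def paraphrase_text(text, tokenizer, model, device="cpu"):
    """Split the text into sentences and paraphrase them one by one."""
    from sentence_splitter import SentenceSplitter

    sentences = SentenceSplitter(language="en").split(text)
    return " ".join(
        paraphrase_sentence(sentence, tokenizer, model, device)
        for sentence in sentences
    )
```

Paraphrasing sentence by sentence keeps each input short, which matches the per-sentence limits described in the list.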

RESTful API server with FastAPI

With this, we are ready to write server/main.py, which loads the paraphrasing, keyword extraction, and synonym finding modules and serves them through a FastAPI endpoint. The server exposes POST /paraphrase, to which the extension sends the user’s input text. The response is a JSON object containing the paraphrased text and the synonyms of the keywords extracted from the text.
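
A minimal sketch of such a server is shown below. The request field text, the response fields paraphrased_text and keywords_synonyms, the helper names, and the permissive CORS settings are all assumptions for illustration. The sketch also assumes that keywords are extracted from the paraphrased text, which the description above leaves open. The three callables stand in for the functions of the paraphrasing module.

```python
def build_response(text, paraphrase, extract_keywords, find_synonyms):
    """Assemble the JSON payload returned by POST /paraphrase."""
    paraphrased = paraphrase(text)
    # Assumption: keywords are taken from the paraphrased text.
    keywords = extract_keywords(paraphrased)
    return {
        "paraphrased_text": paraphrased,
        "keywords_synonyms": {kw: find_synonyms(kw) for kw in keywords},
    }


def create_app(paraphrase, extract_keywords, find_synonyms):
    """Create the FastAPI application around the three NLP callables."""
    from fastapi import FastAPI
    from fastapi.middleware.cors import CORSMiddleware
    from pydantic import BaseModel

    class ParaphraseRequest(BaseModel):
        text: str  # hypothetical field name for the user's input

    app = FastAPI(title="Janus AI")
    # Assumption: allow cross-origin calls so the extension popup can
    # reach the API; tighten this for a public deployment.
    app.add_middleware(
        CORSMiddleware,
        allow_origins=["*"],
        allow_methods=["*"],
        allow_headers=["*"],
    )

    @app.post("/paraphrase")
    def paraphrase_endpoint(request: ParaphraseRequest):
        return build_response(
            request.text, paraphrase, extract_keywords, find_synonyms
        )

    return app
```

In main.py, one would then assign app = create_app(...) with the real functions and start the server with uvicorn main:app, which listens on port 8000 by default.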

Testing of the POST /paraphrase endpoint in the RESTful API server by going to http://localhost:8000/docs
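
Besides the interactive docs page, the endpoint can also be exercised from a short script. The sketch below uses only the Python standard library and assumes the endpoint accepts a JSON body with a text field, which is a hypothetical field name; adjust it to whatever request schema the server actually defines.

```python
import json
import urllib.request

# Local address of the endpoint while the server runs on port 8000.
API_URL = "http://localhost:8000/paraphrase"


def build_request(text, url=API_URL):
    """Build the POST request carrying the input text as JSON."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_paraphrase(text, url=API_URL):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(text, url)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running locally, call_paraphrase("Some sentence to rephrase.") returns the parsed JSON response as a Python dictionary.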

Developing the Edge extension

Microsoft Edge is built on Chromium. Hence, most of the stable chrome APIs (used for developing Chrome extensions) can also be used for developing Edge extensions. The complete list of supported APIs can be found in this documentation.

An Edge extension typically consists of the following:

  • An app manifest JSON containing the configuration of the extension
  • HTML (and CSS) file(s) defining the user interface
  • JavaScript file(s) defining the functionality
  • Icon for the extension and other images as required by the extension

Preliminaries

To start, I create a new directory extension/ to hold the files required by the extension. This folder contains the following:

extension/
├── assets/              # Folder to contain icons and other images
│   └── icons/           # Folder to contain all the icons
│       └── icon.png
├── popup/
│   ├── popup.html       # Code for the popup UI
│   └── popup.js         # Code for logic layer in the popup
└── manifest.json        # App manifest file

Setting up the manifest

Development of extensions usually starts by creating a manifest.json file that contains all the configurations of the extension. A basic manifest file contains the name, description, version (i.e., the version of the extension), and manifest_version.

Janus AI’s manifest also sets the following keys:

  • permissions: The permissions for the chrome APIs needed to implement the extension’s functionality. For Janus AI, the chrome.tabs and chrome.scripting APIs are required.
  • host_permissions: Because our extension should be able to read selected text from any active tab in the browser, the hosts it needs to access are only known at runtime. Thus, "https://*/" should be included in the host permissions.
  • action: This is used to control the behavior of the extension in the Edge toolbar. Within this, default_title specifies what is shown in the tooltip, default_icon specifies the path to the icon to be displayed on the Edge toolbar, and default_popup specifies the HTML file defining the UI of the popup that appears on clicking the extension.
  • icons: These specify the paths to various sizes of the extension’s icon.
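
To see these keys together, the sketch below writes a manifest.json from a short Python script; the dictionary mirrors the JSON file one-to-one, and the extension itself only needs the resulting file. The description, version, tooltip text, and icon size are hypothetical values, and manifest_version 3 is inferred from the use of action, host_permissions, and chrome.scripting rather than stated in the text.

```python
import json
from pathlib import Path

manifest = {
    "name": "Janus AI: Smart Paraphrasing Solution",
    "description": "Paraphrase text with PEGASUS, KeyBERT, and WordNet",  # hypothetical
    "version": "1.0.0",  # hypothetical extension version
    "manifest_version": 3,  # inferred, see the note above
    "permissions": ["tabs", "scripting"],
    "host_permissions": ["https://*/"],
    "action": {
        "default_title": "Janus AI",  # tooltip text (hypothetical)
        "default_icon": "assets/icons/icon.png",
        "default_popup": "popup/popup.html",
    },
    "icons": {"128": "assets/icons/icon.png"},  # size key is hypothetical
}


def write_manifest(directory="extension"):
    """Write the manifest dictionary to <directory>/manifest.json."""
    path = Path(directory) / "manifest.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return path
```

Calling write_manifest() creates extension/manifest.json next to the popup/ and assets/ folders described earlier.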

Creating the user interface

In the manifest file, default_popup in action is set to popup/popup.html, which defines the UI of the Janus AI popup. Below is an outline of the UI and its behavior:

  • An input box (or textarea) for the user to input the text to be paraphrased (either by typing or by selecting text on a tab and clicking the extension)
  • A “Rephrase” button, which when clicked, sends the input to the API server for paraphrasing

Leveraging Chrome APIs to select text in the active tab

The functionality of Janus AI falls into two broad categories. The first involves obtaining the text selected by the user in the browser’s active tab and populating the input textarea with it. This is implemented in the extension/popup/popup.js file using the chrome.tabs and chrome.scripting APIs requested in the manifest.

Adding functionality to interact with the API server

The second category of functionality implemented in the popup.js file involves interacting with the API server: the popup sends the input text to the POST /paraphrase endpoint and then displays the returned paraphrased text and synonyms to the user.

Using the extension on Microsoft Edge

To load the extension into Microsoft Edge, open the Manage extensions page at edge://extensions/, enable Developer mode, click Load unpacked, and select the extension/ folder.

Working demo of Janus AI extension on Microsoft Edge

Distribution of the extension for public use

Once the extension has been developed and tested, it is ready for distribution. First, however, the server must be hosted online. Although it’s possible to use platforms like Heroku or set up an Azure compute instance to host the server, I explore a more scalable and hassle-free approach that uses Docker and Azure Container Instances.

Containerizing and hosting the server using Azure Container Instances

# Log into Azure CLI
az login
# Create a resource group (calling it 'janus')
az group create --name janus --location eastus
# Create a container registry (calling it 'januscr'; registry names must be alphanumeric)
az acr create --resource-group janus --name januscr --sku Basic
# Log into the registry
az acr login --name januscr
# Tag the image with the name of the login server of the registry
docker tag janus_ai <login-server>/janus_ai
# Push the image to the registry
docker push <login-server>/janus_ai
# Create a container instance (its name and DNS label may only use lowercase letters, numbers, and hyphens)
az container create --resource-group janus --name janus-ai --image <login-server>/janus_ai --dns-name-label janus-ai --ports 8000

Publish the extension on Microsoft Edge Add-ons website

Having deployed the server, the extension can be published on the Microsoft Edge Add-ons website to increase its reach and make it available to other Microsoft Edge users. The detailed steps to do so are described in this documentation.

Concluding remarks

In this article, I have illustrated how to develop a text paraphrasing extension for Microsoft Edge that is powered by state-of-the-art NLP models. I started by creating the API server with the modules for paraphrasing, keyword extraction, and synonym finding, and then built the UI and functionality of the extension from scratch. Toward the end, I also discussed how to containerize the server, deploy it using Azure Container Instances, and publish the extension on the Microsoft Edge Add-ons website.

References

  1. How To Paraphrase Text Using PEGASUS Transformer
  2. KeyBERT Documentation
  3. NLTK :: Sample usage for wordnet
  4. Overview of Microsoft Edge extensions — Microsoft Edge Development | Microsoft Docs
  5. Extensions — Chrome Developers
  6. Quickstart — Build a container image on-demand in Azure — Azure Container Registry | Microsoft Docs
  7. Quickstart — Deploy Docker container to container instance — Azure CLI — Azure Container Instances | Microsoft Docs

Tezan Sahu

Data & Applied Scientist at Microsoft | B. Tech from IIT Bombay | GSoC’20 with PEcAn Project