How I built a Chrome extension with Plasmo

Published in

SLTC — Sean Learns To Code

7 min read · Jun 2, 2023

Once upon a time in the US 🇺🇸, when the job market for tech workers was still hot 🔥, I was interviewed by a company whose name I will not say. The interview experience was great. I was very excited to learn from the recruiter that I cleared all the rounds and that the feedback was positive 🙏.

A week later I was told that the headcount had been revoked due to the changing macroeconomic environment 😢.

One good thing about the interview process was that the coding challenge was pretty fun: given a piece of HTML taken from an article (think a WaPo article), write some JavaScript / TypeScript code to disemvowel all the textual content.

So what is disemvoweling? This is taken from Wikipedia:

Disemvoweling, disemvowelling (see doubled L), or disemvowelment of a piece of alphabetic text is rewriting it with all the vowel letters elided.[1] Disemvoweling is often used in band and company names. It used to be a common feature of SMS language where space was costly.

For example, the sentence "The quick brown fox jumps over the lazy dog", after being disemvowelled, becomes "Th qck brwn fx jmps vr th lz dg".

In this coding challenge I was given a specific piece of HTML. One thing that I thought to myself after the interview was that it would be fun to build a Chrome extension that would do this to the content of any page.

After months of fighting procrastination 🐢, I’ve finally managed to finish this little project. The source code is available on GitHub. The extension is named dsmvwl 😉. I have also submitted the extension to the Chrome Web Store and the submission is currently pending review.

Below are a couple of nice things that I learned from building this extension.

Plasmo

Procrastination aside, one of the main reasons it took me quite some time to finish this project was that I had no experience creating a Chrome extension from scratch. I had to spend some time going through the documentation on how to build a Chrome extension. By the time I finished all the tutorials, my impression of building extensions was that it involved too much configuration boilerplate, which took away a significant part of the joy of working on the meaty parts of whatever I wanted to build.

Because I didn’t want to go through the process of creating every single file needed for building an extension from scratch, I started searching for frameworks or libraries that could help with soothing the pain of the boilerplate. That’s when I found Plasmo!

Plasmo helps you build, test, and deploy powerful, cutting-edge products on top of the web browser.

If you think that working with extension boilerplate is a pain in whatever part of your body then Plasmo is definitely the mother of all painkillers.

Why Plasmo?

Here are a few reasons why I chose Plasmo and why I would recommend it to other people for their extensions:

Boilerplate reduction: the starter code is generated for you by the Plasmo CLI. This saves a lot of time when kick-starting a new extension project.
Cross-browser support: Plasmo abstracts away the nitty-gritty details of building an extension for different browsers (Chrome, Firefox, and heck even Microsoft stuff 😅). Even if you mostly just use Chrome (like me), this is still a very nice feature.
Third-party support: Plasmo comes with TypeScript by default and also allows users to generate starter code with baked-in integration for other essential frontend development libraries such as Jest, React, or Redux. Want to build your extension using Svelte? That’s supported, too.
Documentation: most of the features you need for your extension are very well documented. One of the greatest things about Plasmo’s documentation is the availability of examples that can be found here.

How to start with Plasmo?

It’s just a matter of running a couple of commands in your terminal. I would recommend checking out the documentation page at https://docs.plasmo.com/framework.

Overall architecture

With all the setup quickly done thanks to Plasmo, the next step is to figure out the overall architecture of the extension and the dependencies that will be needed.

Popup

The popup is the piece of UI that shows up when a user clicks on the extension icon.

For this project I needed a very simple UI with a couple of buttons for users to click on. I built the popup UI with React and MUI components. The code of the popup looks like this:

import { Button, Stack } from "@mui/material";
import { useState } from "react";

function IndexPopup() {
  const [disemvoweled, setDisemvoweled] = useState(false);

  async function handleClick(command: string) {
    // TODO: handle the click event
  }

  return (
    <Stack direction='row' gap={2}>
      <Button disabled={disemvoweled} variant="contained" color="primary" onClick={() => handleClick("disemvowel")}>Disemvowel</Button>
      <Button disabled={!disemvoweled} variant="contained" color="secondary" onClick={() => handleClick("reset")}>Reset</Button>
    </Stack>
  )
}

export default IndexPopup

At startup, the popup only has the Disemvowel button enabled. After the page is disemvowelled, the Reset button is enabled and Disemvowel is disabled. If users reset the page, Reset is disabled and Disemvowel is enabled again. Exactly one of the two buttons is enabled at any time.

Content script

The popup doesn’t have access to the DOM tree of the page on the currently active tab. To be able to access that DOM tree we need a content script. The code in the content script looks like this:

import { useMessage } from "@plasmohq/messaging/hook"
import { useState } from "react";

const Disemvowel = () => {
    useMessage<string, string>(async (req, res) => {
        // TODO: handle the req and send data back to the popup using res.send
    })
    // The content script renders no UI of its own.
    return null;
}

export default Disemvowel;

Some explanation on useMessage will be provided in the next section on Messaging.

Messaging

Because the popup and the content script live in separate “worlds”, we need a means of communication between the two. Plasmo provides a Messaging API to handle message passing across different kinds of scripts in an extension.

In this project, we want to use that API to:

From the popup, send a message to the content script
From the content script, receive the message, perform some logic, and send a reply back to the popup

The first task can be done by calling sendToContentScript, a function provided by Plasmo’s Messaging API. The function needs a request object that has, at minimum, a name property for the message.

With sendToContentScript, the click handler of the popup’s buttons looks like this:

import { sendToContentScript } from "@plasmohq/messaging"
import { useState } from "react";

function IndexPopup() {
  const [disemvoweled, setDisemvoweled] = useState(false);

  async function handleClick(command: string) {
    // Send the command to the content script and wait for its reply.
    const resp = await sendToContentScript({ name: command });
    // The reply tells the popup which button to enable next.
    if (resp === 'disemvoweled') {
      setDisemvoweled(true);
    } else if (resp === 'reset') {
      setDisemvoweled(false);
    }
  }
  // ...
}

export default IndexPopup

For the second task, the content script can use the useMessage hook, whose first parameter is a callback function that receives:

The message sent from the popup
A response handler that has a send method that can be used to send a reply back to the popup
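
To make the contract between the two sides explicit, here is a minimal sketch of the request and reply values as plain TypeScript, with no Plasmo imports. The names Command, Reply, handleCommand, and nextDisemvoweled are hypothetical and only for illustration; they are not part of Plasmo or of the extension's source.

```typescript
// Hypothetical names for illustration; not part of Plasmo or the dsmvwl source.
type Command = "disemvowel" | "reset";
type Reply = "disemvoweled" | "reset";

// Content-script side: run the action for a command and report what happened.
function handleCommand(
  command: Command,
  actions: { disemvowel: () => void; reset: () => void }
): Reply {
  if (command === "disemvowel") {
    actions.disemvowel();
    return "disemvoweled";
  }
  actions.reset();
  return "reset";
}

// Popup side: derive the new button state from the reply.
function nextDisemvoweled(current: boolean, reply: string | undefined): boolean {
  if (reply === "disemvoweled") return true;
  if (reply === "reset") return false;
  // Unknown or missing reply: leave the buttons as they are.
  return current;
}
```

Keeping both sides as pure functions like this would make the protocol easy to unit test without loading the extension in a browser.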

The main logic

With all the architectural components of the extension created, the only missing piece of the project is the main logic. There are 2 features that the extension needs to provide:

Disemvowelling: remove all the vowels in the textual content of the page
Reset: revert the textual content back to the original state

Disemvowelling

The algorithm for disemvowelling the textual content of a page works as follows:

Traverse the DOM tree of the page starting from document.body
At each node, if the nodeType is TEXT_NODE, remove all the vowels in the node’s textContent
Otherwise, continue the traversal for all the child nodes

The code looks like this in TypeScript:

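const disemvowel = (node: Node) => {
    if (node.nodeType === Node.TEXT_NODE) {
        // textContent is typed as string | null, so fall back to an empty string.
        node.textContent = removeVowels(node.textContent ?? "");
    } else {
        // Not a text node: recurse into every child.
        node.childNodes.forEach((child) => disemvowel(child));
    }
};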

The exact implementation of removeVowels can be found in the GitHub repository. It’s also a fun exercise for readers to work on, and if you can’t figure it out, I heard there’s some fancy code autocompletion tool out there that can help you 😉.
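
For readers who want a starting point, here is one possible sketch of removeVowels. It is not the version from the repository. It assumes that only a, e, i, o, and u (in either case) count as vowels; the example earlier in this post also drops the y in "lazy", so the real implementation may well draw the line differently.

```typescript
// Assumption: only a, e, i, o, u (upper or lower case) are treated as vowels.
const VOWELS = /[aeiou]/gi;

function removeVowels(text: string | null): string {
  // textContent can be null, so treat a missing value as an empty string.
  return (text ?? "").replace(VOWELS, "");
}

console.log(removeVowels("The quick brown fox")); // prints "Th qck brwn fx"
```

A regular expression keeps the function to a single line; treating y as a vowel would only require adding it to the character class.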

Reset

This feature was added as I worked on this project and was not a requirement that I had to work on during the interview. I figured it didn’t make sense for users to have to reload the page if they wanted to reset the textual content. Having the extension support a reset capability is more convenient and thus provides a better user experience.

In order to be able to reset the content, we need to keep a backup of the original content before disemvowelling it. I decided to update the code of disemvowel with 2 extra parameters:

ancestors: an array of numbers recording, for each step on the path from document.body down to a node, the index of that step within its parent’s array of child nodes. This helps uniquely identify the position of a node in the DOM tree.
lookup: a Map<string, string> that allows us to look up the original content of a node based on its node id, which is created from ancestors

The new code of disemvowel looks like this:

const disemvowel = (node: Node, ancestors: number[], lookup: Map<string, string>) => {
    if (node.nodeType === Node.TEXT_NODE) {
        // The path of child indexes from the root identifies this text node.
        const nodeId = ancestors.join(".");
        // Cache the original text only once, so a repeated run keeps the true original.
        if (!lookup.has(nodeId)) {
            lookup.set(nodeId, node.textContent ?? "");
        }
        node.textContent = removeVowels(node.textContent ?? "");
    } else {
        node.childNodes.forEach((child, key) => disemvowel(child, [...ancestors, key], lookup));
    }
};

As the content of a page is being disemvowelled, its original content is cached in the lookup map, which is then passed to the reset function when we want to reset. The code of reset looks like this:

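const reset = (node: Node, ancestors: number[], lookup: Map<string, string>) => {
    if (node.nodeType === Node.TEXT_NODE) {
        const nodeId = ancestors.join(".");
        const original = lookup.get(nodeId);
        // Only restore nodes that were cached; leave any other text node untouched
        // instead of blanking it out.
        if (original !== undefined) {
            node.textContent = original;
        }
    } else {
        node.childNodes.forEach((child, key) => reset(child, [...ancestors, key], lookup));
    }
};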
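
To see how the index-path ids and the lookup map work together, here is a small sketch that runs outside the browser. TreeNode is a hypothetical stand-in for a DOM node, and disemvowelTree, resetTree, and stripVowels are illustrative names that mirror the functions above rather than the extension's actual code.

```typescript
// Hypothetical stand-in for a DOM node so the sketch runs without a browser.
interface TreeNode {
  text: string | null; // set on text leaves, null on element-like nodes
  children: TreeNode[];
}

// Assumption: only a, e, i, o, u count as vowels.
const stripVowels = (s: string): string => s.replace(/[aeiou]/gi, "");

function disemvowelTree(node: TreeNode, path: number[], lookup: Map<string, string>): void {
  if (node.text !== null) {
    const id = path.join(".");
    // Cache the original text once, keyed by the index path.
    if (!lookup.has(id)) lookup.set(id, node.text);
    node.text = stripVowels(node.text);
  } else {
    node.children.forEach((child, i) => disemvowelTree(child, [...path, i], lookup));
  }
}

function resetTree(node: TreeNode, path: number[], lookup: Map<string, string>): void {
  if (node.text !== null) {
    const original = lookup.get(path.join("."));
    if (original !== undefined) node.text = original;
  } else {
    node.children.forEach((child, i) => resetTree(child, [...path, i], lookup));
  }
}

const leaf = (text: string): TreeNode => ({ text, children: [] });
const body: TreeNode = {
  text: null,
  children: [leaf("Hello"), { text: null, children: [leaf("brave"), leaf("world")] }],
};

const lookup = new Map<string, string>();
disemvowelTree(body, [], lookup);
console.log([...lookup.keys()]); // the ids are "0", "1.0" and "1.1"
resetTree(body, [], lookup);     // every leaf is back to its original text
```

One caveat of index-based ids is that they only stay valid while the tree keeps its shape. If the page adds or removes nodes between disemvowelling and resetting, a path may point at a different node than the one that was cached.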

Now we can fill in the missing part of the content script code with the following:

const Disemvowel = () => {
    // The map is mutated in place, so it is created once and no setter is needed.
    const [lookup] = useState(() => new Map<string, string>());
    useMessage<string, string>(async (req, res) => {
        const { name } = req;
        if (name === "disemvowel") {
            disemvowel(document.body, [], lookup);
            res.send("disemvoweled");
        } else if (name === "reset") {
            reset(document.body, [], lookup);
            // Drop the cached originals so a later disemvowel starts from the current content.
            lookup.clear();
            res.send("reset");
        }
    })
    return null;
}

And here we go! Happy disemvowelling!