Learn how to make your first Chrome extension — Part 1

Jackson Prince
Sep 19 · 16 min read

This two-part series is written for those looking to make something cool with a browser extension. Specifically, anyone looking to catch Pokemon while surfing the web. This piece is part 1 and here is part 2.

So, everyone.

In all likelihood, the only thing you employ browser extensions for is swatting ads. Maybe you have Sudoku or Honey the money-saver, but you almost certainly have AdBlock – the pre-eminent powerhouse of better browsing.

My hope is that by the end of these two articles, you are inspired to create something equally ubiquitous and powerful.

In this first article, we’ll work through the fundamentals of a browser extension, then build a small project that allows us to turn all images on the page into Pikachu.

In part 2 – Catch ’Em All Chrome Extension – we move on to our main project: serving up catchable Pokemon with each new web page we visit.

I’ll write my code as a Google Chrome extension, but know that extensions for other browsers are built from the same ingredients: JavaScript, HTML, and CSS.

Article outline:

  1. What is an extension?
  2. Why an extension?
  3. How to build an extension?
  4. Our first extension (replace everything with Pikachu).

If all goes according to plan, this is where we’ll be by the end of this article:

First, the fundamentals…


What Is an Extension?

An added skill

Your web browser is the most powerful piece of software on your computer. It knows more than anything else you own, containing vast yottabytes of incredibly specific data within a click’s reach.

And yet, it’s humble enough to realize that it doesn’t know everything.

It doesn’t know that you prefer your cursor to be pizza. Neither does it know how to save you great amounts of money while you shop online. Nor does it inherently bat away the ads that companies pay your browser to show (that would constitute a suable conflict of interest).

Instead, these needle-point, specific skills are made available as “extensions.”

But what’s that?

It’s just code

Like all things on the web, extensions are just code. You’re adding a few lines of functionality to your browser to enhance its abilities while you surf around from URL to URL.

Maybe the code you write changes everything on the page to kittens. Maybe it detects ads and thwarts their entrance into the DOM. Maybe it turns your browser nav bar to the color of the sky according to your approximate geolocation (don’t know if that exists, but would be cool and very possible).

Pretty much anything is possible in an extension, because, at its core, it’s just JavaScript, HTML, and CSS, with all the power that entails.


Why Is It a Thing?

Extensions are in everybody’s best interest.

Why our best interest?

They’re in our best interest, because we can personalize and enhance our experience of the web.

We can dictate visual appearance, play games, instantly access specific web services (e.g. Google Translate), look words up by clicking on them (Dictionary)… all without ever leaving the current page.

That’s awesome.

It’s exceptionally awesome as a developer, because I can write my own desired browsing experience into existence, then test its merit by making it publicly available. To boot, they’re by and large completely free. No cost. All benefit.

Why their best interest?

At the same time, they’re in Google’s (let’s say) best interest, because we, the users, take more personal ownership over our web experience and, as a result, enjoy Google’s product that much more. Our happiness is their happiness.

Not to mention that the open-source mentality, which is fundamental to the spirit and perseverance of the web, leads to greater innovation and a better overall product. Google would never come remotely close to internally generating every feature currently available in the Chrome Web Store.

Extensions also serve the browser as a whole by eliminating code which is not otherwise absolutely necessary.

By providing a library of “additional browsing options”, browsers lighten their baseline computational load, while still providing the desired services for those who want or need them.

Efficient.


How To Code Extensions?

Extensions command an unnecessary aura of intimidating complexity. As if their code is somehow “different”. In truth, if you’ve made a webpage using JavaScript, HTML, and CSS, then you’ve won 90% of the battle.

There are only three foreign concepts to take special note of. Embed these in your brain and you’re good:

1. You’re working with three isolated webpages

I can’t emphasize this fact enough. My first eureka! moment building a Chrome extension was when I realized I was literally working with three separate webpages that know nothing about one another unless we instruct them to communicate.

Here are those pages for you now, the first of which you’re intimately familiar with already:

  • Our webpage:
Build a Chrome Extension, Picture of Blank Google Webpage

Above is a regular old webpage. It could just as easily have been Amazon or Medium or Reddit.

  • Our popup:

That AdBlock popup you see is a whole webpage unto itself. It’s not part of the Google homepage. It knows nothing about B.B. King. Nor is it part of any other webpage. It is its own entity, containing HTML, CSS, and JavaScript, functioning in total independence.

Just as your browser loads the appropriate HTML, CSS, and JavaScript when you arrive at a new URL, it loads HTML, CSS, and JavaScript into that popup screen when you click the AdBlock icon and see a small page appear.

The difference is, instead of going and finding those files over a network request, your browser just carries those files with it at all times. When you load a new Chrome extension into your browser, that code is always and immediately available.

Note: AdBlock is, of course, aware of the current webpage, otherwise it wouldn’t be able to do its job. But it’s only aware of the webpage and B.B. King because there are JavaScript files instructing the two pages to communicate.

They’re not inherently linked. We’ll gain a deep understanding of this in time. For now, think about their interaction with one another as shouting over a fence, rather than sharing a home.

And our final page, the workhorse and my favorite of the three:

  • Our background page/script:

For which there is no picture, because it runs invisibly in the background.

We use this page to monitor events related to the browser itself like, for example, listening for when our extension’s icon is pressed or retrieving and storing information until the popup calls for it.

(Hint: We will use the background script to retrieve Pokemon.)

If you can wrap your head around the fact that there are three independent web pages at work in an extension, you’re nearly golden. You’re not required to use all three. If you only need one of these pages to get the job done, then hell yes. Use the one.

Over the course of this series, we will learn and work with all three. Which means we will be working in three separate developer consoles!

There is no better way to demonstrate the idea that these are three independent webpages than to bring up all three, independently run consoles. Here’s an in-action still-frame:

You’re looking at three different consoles, each one completely unaware of the other two. Left deals with background, right with popup, bottom with web page.

If this strikes you as odd and chaotic, be comforted by the fact that it does for me, too. But, good news is: you just made it through the hardest part.

The rest is gravy.

2. You direct conversation between your extension’s webpages

Again, the three pages of your extension (web, popup, background) are independently owned and operated. To get them communicating, you’ll use the Chrome API.

Let’s just ignore the fact that API is the worst acronym in all of programming (what can “Application Programming Interface” not mean?).

Think of the phrase Chrome API as: “functions Chrome provides to people who make extensions”. Or Chrome-Speak, which is how I’ll refer to it for the remainder of this article.

Here’s an example of some Chrome-Speak:

chrome.tabs.sendMessage()

Translation:

Within chrome’s collection of functions for extensions to use, let’s grab the group called tabs, which deals with the browser’s tabs. Now, from the tabs group, let’s use the function that “sends” a “message”.

What this function illustrates is that we have to go out of our way to send and receive information between our three pages. We will use this exact function later on to send a message from our background to our webpage.
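To make the shape of that call concrete, here is a minimal sketch. The helper name greetTab and the contents of the message are made up for illustration; only chrome.tabs.sendMessage and chrome.runtime.lastError come from Chrome itself. The last argument is an optional callback that receives whatever the other side sends back.

```javascript
// Sketch: shout a message over the fence to one tab and log the reply.
// greetTab and the { greeting: ... } shape are illustrative assumptions.
function greetTab(tabId) {
  chrome.tabs.sendMessage(tabId, { greeting: "hello" }, function (response) {
    // Chrome sets runtime.lastError when nothing in that tab is listening.
    if (chrome.runtime.lastError) {
      console.log("no reply:", chrome.runtime.lastError.message);
      return;
    }
    console.log("reply from the page:", response);
  });
}
```

A call like this belongs in a page that has access to chrome.tabs, such as the background script.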

The best tool you have to learn Chrome-Speak is Chrome’s documentation, which is good, not great. But it gets the job done.

3. You need a manifest

The final oddity of extension building is that you need something called a manifest. Manifest is another one of those words which is twice as long and obscure as it needs to be. An extension manifest is a list.

The list says two things:

  • Here are the key files in my extension.
  • Here are some default behaviors (e.g. “I want a popup page” or “I don’t want a popup page”).

That’s it.

Your browser requires that you include a manifest in your extension so it can better organize its thoughts and position itself to receive and implement your code.

Remember, your browser did not plan for whatever code you write in your extension. Therefore, the only way for your browser to play nice with your code is for you to tell your browser what’s in your code, and how you expect your browser to handle it.

In this example manifest, we’re telling our browser the absolute minimum: name and version numbers.

{
"name": "Everything Pikachu",
"version": "1.0",
"manifest_version": 2
}

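Note: “manifest_version” is not your extension’s version. It tells Chrome which version of the manifest format you’re writing in. This series uses version 2 (Chrome has since introduced a version 3 format, which renames some of the keys you’ll see below).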

As far as the browser knows, based on this manifest, our entire extension is just a name and a few version numbers. No JavaScript. No HTML. No CSS.

Regardless of how many wonderful scripts may actually be present in your extension’s folder, none of them will appear or function without first telling your browser of their existence.

Let’s do that now:

{
"name": "Everything Pikachu",
"version": "1.0",
"manifest_version": 2,
"browser_action": {
"default_popup": "popup.html"
}
}

Nice. Your browser now knows a few important bits of information it was previously without.

It knows about your extension, called “Everything Pikachu”; that it’s version 1.0 of this extension; and that when you click the extension’s icon (which your browser considers a “browser_action”, because it’s an action occurring directly in the browser), the page in your extension’s folder named “popup.html” will appear.

Fairly self-explanatory. So long as we let our browser know what’s going on, it will comply.

In terms of important overarching concepts to review, that’s it. We’re now in a good position to start building.


Our First Extension (Everything Pikachu)

Goal: Click a Pokeball icon in the corner of our screen and immediately see the current web page’s images substituted with Pikachu.

No popup (we’ll create a popup in the next article). All of our work will consist of communication between the background script and web page.

Note: I assume you have basic familiarity with JavaScript. Still fine if not.

Create and load extension

Begin by creating a new directory on your computer with the desired name of your extension and navigating inside of it so we can start making new files.

$ mkdir everythingPikachu
$ cd everythingPikachu

Remember the manifest I mentioned us needing above? Let’s create that document first, and add the first few lines from above.

$ touch manifest.json
$ atom .

// Within manifest.json =>
{
"name": "Everything Pikachu",
"version": "1.0",
"manifest_version": 2
}

Note, again: “manifest_version” is required, and the code in this series expects its value to be 2.

You should see this setup:

Guess what? That’s an extension. An empty one, but an extension nonetheless. Let’s load it into the system.

Navigate to chrome://extensions in your address bar…

…make sure Developer mode (top-right corner) is turned on…

…click load unpacked and select your everythingPikachu directory. If you complete those actions, you should see your new extension has loaded:

We’ll return to this page every time we make changes to our manifest and scripts.

One unfortunate necessity of building Chrome extensions is that you need to constantly reload the extension in the browser (with the exception of the popup’s CSS and HTML). The good thing is it only takes a second.

Add an icon

Return to the everythingPikachu/ directory. Let’s add an icon to our currently boring “E” extension (see newly appended gray icon in your browser toolbar).

Search for a good .png you like online (of Pikachu or a Pokeball) and make sure its dimensions are square. All the better if you can find a transparent background. I saved this guy.

Your file structure should look like this:

Remember, we need to tell the manifest all the key information. Let’s let it know we want this Pokeball to be our icon…
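The updated manifest isn’t reproduced in this text, so here is a minimal sketch of what it could look like. It assumes the image was saved as pokeball.png in the extension folder (the filename is my assumption) and uses the default_icon key of browser_action. It’s written as a JavaScript object so you can run it with Node to check it; the JSON it prints is what would go into manifest.json.

```javascript
// Sketch of manifest.json after adding a toolbar icon.
// "pokeball.png" is an assumed filename; use the name of the image you saved.
const manifest = {
  name: "Everything Pikachu",
  version: "1.0",
  manifest_version: 2,
  browser_action: {
    default_icon: "pokeball.png", // image shown in the browser toolbar
  },
};

// Print the JSON that would go into manifest.json.
const manifestJson = JSON.stringify(manifest, null, 2);
console.log(manifestJson);
```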

…return to chrome://extensions and press the reload button. You should now see this in the corner of your screen:

See that Pokeball? You’re up and running. Click it. Nothing happens. But that little ripple effect lets you know how ready Chrome is to make some magic.

Display Pikachu everywhere

Up until this point, I’ve alluded to there being three pages, one of which is the webpage. That’s true. In Chrome-Speak, we refer to this page as the Content. Webpages hold content with which we engage when we browse the web. So that makes sense.

If we want to access that content, we create what is called a content_script. In fact, you can create many different scripts that operate on a webpage’s content. So long as you declare them all as content_scripts, you’re good.

Let’s create and declare one now.

From our everythingPikachu/ directory…

$ touch content.js

// Within content.js =>
console.log('content script is running')

If we tell Chrome the right information, we should see content script is running in our web page developer console anytime we go to a new web page. That is, anytime we engage with new content.

Let’s make those assertions in the manifest now:
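The manifest itself isn’t reproduced in this text. A minimal sketch that matches the translation, assuming the icon file is called pokeball.png (my assumption), could look like the following. It’s written as a JavaScript object so it can be checked with Node; the printed JSON is what would go into manifest.json.

```javascript
// Sketch of manifest.json with a content script declared.
// "pokeball.png" is an assumed filename for the icon.
const manifest = {
  name: "Everything Pikachu",
  version: "1.0",
  manifest_version: 2,
  browser_action: {
    default_icon: "pokeball.png",
  },
  content_scripts: [
    {
      matches: ["<all_urls>"], // run on every page we visit
      js: ["content.js"], // scripts to load, relative to the extension folder
    },
  ],
};

// Print the JSON that would go into manifest.json.
const manifestJson = JSON.stringify(manifest, null, 2);
console.log(manifestJson);
```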

Translated: Chrome, here are the content_scripts we want to run when we load “any” new webpage (i.e. “all urls”). The first one written in “JavaScript” is called “content.js”. Find it for me. It’s in this directory.

Make sure you click reload (bottom right) in your management page:

Test the new script by opening a new tab or refreshing a current tab. Open your developer console. You should see content script is running logged…

Now that we know our extension has access to our web pages, let’s start manipulating them by changing all available images to this one of Pikachu exploding in a rage of electricity.

  • content.js updated…

We are iterating through the web page’s collection of images and changing their source.
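The updated content.js isn’t reproduced in this text, so here is a minimal sketch of that loop. PIKACHU_URL is a placeholder for whichever image you picked, and replaceImages is a name I made up. Clearing srcset is an extra step of mine, for sites that use it to choose between image sizes.

```javascript
// content.js (sketch): swap every image on the page for Pikachu.
// PIKACHU_URL is a placeholder; point it at the image you want to use.
const PIKACHU_URL = "https://example.com/pikachu.png";

function replaceImages(doc, url) {
  const images = doc.getElementsByTagName("img");
  for (const img of images) {
    img.src = url;
    // If srcset is set, the browser prefers it over src, so clear it.
    img.srcset = "";
  }
  return images.length;
}

// document only exists inside a web page, so skip the call elsewhere.
if (typeof document !== "undefined") {
  replaceImages(document, PIKACHU_URL);
}
```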

Reload the extension in your extensions manager page, then open a new webpage. The New York Times is a particularly fun example.

Nice! You just successfully created an extension that manipulates images on a webpage.

Now, let’s practice communicating between our “content script” (script associated with the web page) and the “background script”, which is constantly running behind the scenes, listening for browser actions.

Background script

Guess where we’re going before we do anything? Back to our all-knowing manifest.

We must first let Chrome know we’re adding a background script before trying to add a new document to our directory:
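The manifest isn’t reproduced in this text. A minimal sketch with the new entry at the bottom, again assuming an icon file named pokeball.png (my assumption), could look like this. It’s written as a JavaScript object so it can be checked with Node; the printed JSON is what would go into manifest.json.

```javascript
// Sketch of manifest.json with a background script declared.
// "pokeball.png" is an assumed filename for the icon.
const manifest = {
  name: "Everything Pikachu",
  version: "1.0",
  manifest_version: 2,
  browser_action: {
    default_icon: "pokeball.png",
  },
  content_scripts: [
    {
      matches: ["<all_urls>"],
      js: ["content.js"],
    },
  ],
  background: {
    scripts: ["background.js"], // loaded into the invisible background page
  },
};

// Print the JSON that would go into manifest.json.
const manifestJson = JSON.stringify(manifest, null, 2);
console.log(manifestJson);
```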

Check that bottom entry “background”… it’s pretty self-explanatory. We’re adding to “background” a “script” by the name of “background.js”.

Before you reload the extension in your management console, let’s create a background.js that logs to the console…

$ touch background.js

// Within background.js =>
console.log('background script is running')

…which should look like this in the file tree…

Make sure to reload your extension. Changes to the background script require a reload.

Refresh your current webpage or load a new one, then open up the developer console… you should see…

Nothing!

Remember, the background script and background page operate silently in the background, independent of the content script and web page.

If we want to access the background script’s console log, we need to inspect the background page, an option made available to us at chrome://extensions:

Click background page and you should see a window pop up. That window is your dedicated background script developer console. Within its console are the words background script is running:

Note: If you reload or visit new web pages, background script is running will not be re-logged to the console.

A background script operates for as long as Chrome is open. So long as Chrome is active and the extension is activated, its background script will run without pause, constantly listening for actions taking place.

Let’s test that the background script is listening for events by console logging “pokeball icon was pressed” whenever we click the extension icon. This will require some of that beloved Chrome-Speak:
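The snippet isn’t reproduced in this text, so here is a minimal sketch of it. The function name onIconClicked is mine; chrome.browserAction.onClicked.addListener is the Chrome-Speak in question. The typeof check is only there so the file can also be loaded outside of an extension without crashing.

```javascript
// background.js (sketch): log each click on the extension's toolbar icon.
function onIconClicked(tab) {
  // Chrome passes in the tab that was active when the icon was clicked.
  console.log("pokeball icon was pressed");
}

// chrome only exists inside an extension, so skip registration elsewhere.
if (typeof chrome !== "undefined" && chrome.browserAction) {
  chrome.browserAction.onClicked.addListener(onIconClicked);
}
```

One thing to keep in mind: this click event only fires when the manifest has a browser_action entry without a default_popup. If a popup is declared, clicking the icon opens the popup instead.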

Translate: “chrome,” you keep track of events on the “browserAction” (our icon), one of them is the event of me “clicking” it… add a “listener” for when that happens. When it does, log ‘pokeball icon was pressed’ to the console.

Remember, this script pertains to the invisible background page, so all console logs will appear in the background console we re-discovered just moments ago.

Now, if we click the icon at the top of the screen 18 times, we should see the same message appear 18 times in the background’s console:

We’re close.

Our goal was to be able to trigger a Pikachu image cascade by clicking the Pokeball icon. We’ve done both of those things independent of one another. Now, we combine them into a series of calls and responses.

Flow of communication:

  1. Background script will detect that the icon has been pressed.
  2. When pressed, it will alert the content script.
  3. Content script will receive the message.
  4. If the message is correct, content script will then transform all imagery.

Putting It All Together

We already accomplished #1. Let’s work on #2.

Instead of console logging text when we press the icon, let’s tell the background script to send a message to the current tab (using the same Chrome-Speak we demonstrated in the section You direct conversation between your extension’s webpages).

Ln 1 translate:

When the icon is clicked, run the function tellContentScript, which takes in an argument of the current “tab” (the current tab is retrieved and passed implicitly by Chrome).

Ln 3+ translate: “chrome” get me all the “tabs” and “send a message” to the current tab (we know which one that is, because the current tab object is being implicitly passed through the tellContentScript function).

Note: sendMessage takes at least two arguments: the ID of the tab to send to and the message itself, here in the form of an object.
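The listing those notes refer to isn’t reproduced in this text. Here is a minimal sketch consistent with them. The property name txt is my assumption; any key works as long as content.js checks the same one.

```javascript
// background.js (sketch): when the icon is clicked, message the current tab.
// The property name "txt" is an assumption; it only has to match
// whatever content.js looks for.
function tellContentScript(tab) {
  const msg = { txt: "send it, dude" };
  chrome.tabs.sendMessage(tab.id, msg);
}

// chrome only exists inside an extension, so skip registration elsewhere.
if (typeof chrome !== "undefined" && chrome.browserAction) {
  chrome.browserAction.onClicked.addListener(tellContentScript);
}
```

Content scripts are injected when a page loads, so a tab that was already open before you loaded or reloaded the extension needs a refresh before anything on it can hear this message.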

At this point, every time we click the icon a message will be sent, but nothing will receive it. We need to establish both sides of our communication.

Return to our content.js file and prepare Chrome to receive a message from the background script.

Translate: “chrome” while you’re “running” this content script, “listen” for a “message” from the background script. When you receive it, run the function gotMessage.

Note: gotMessage (or whatever you decide to name that function) takes in three arguments by default.

The first is the message received from the background script, the second is the identity of the sender, and the third is a callback function you can use to throw a response back to the background script, perhaps thanking it for the alert.

Within gotMessage, we check to see if the message received is the one we expect (“send it, dude”) — if it is, go ahead and change all the images to Pikachu.
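The content.js listing isn’t reproduced in this text, so here is a minimal sketch of the receiving side. PIKACHU_URL is a placeholder, and the property name txt is my assumption; it has to match whatever the background script sends. The reply through sendResponse is optional.

```javascript
// content.js (sketch): wait for the background script, then swap the images.
// PIKACHU_URL is a placeholder and "txt" is an assumed property name that
// must match whatever background.js sends.
const PIKACHU_URL = "https://example.com/pikachu.png";

function gotMessage(message, sender, sendResponse) {
  if (message && message.txt === "send it, dude") {
    const images = document.getElementsByTagName("img");
    for (const img of images) {
      img.src = PIKACHU_URL;
    }
    // Optional reply; it goes unused unless the sender passed a callback.
    sendResponse({ replaced: images.length });
  }
}

// chrome only exists inside an extension, so skip registration elsewhere.
if (typeof chrome !== "undefined" && chrome.runtime && chrome.runtime.onMessage) {
  chrome.runtime.onMessage.addListener(gotMessage);
}
```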

Before we test our work: reload the extension from our extension manager page.

Once done, you should be able to do this:

Powerful. With this format, we can change anything on the page: text, buttons, videos, etc.
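As a quick illustration of going beyond images, here is a minimal sketch that rewrites heading text instead. The function name pikafyHeadings and the choice of h1 through h3 are mine, not part of the project.

```javascript
// Sketch: rewrite the text of every heading on the page.
// pikafyHeadings and the selector are illustrative, not from the project.
function pikafyHeadings(doc, text) {
  const headings = doc.querySelectorAll("h1, h2, h3");
  for (const heading of headings) {
    heading.textContent = text;
  }
  return headings.length;
}
```

Inside the content script, you could call pikafyHeadings(document, "Pika pika!") from gotMessage, right next to the image swap.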

That’s a ton of fun and an extremely small amount of code. We’ll evolve our project in the next article by creating a popup, styling it, and making it interact with the background script to simulate a random Pokemon encounter (using the all-time greatest, PokeAPI):

Catch ’Em All Chrome Extension


Conclusion

If you’re dropping off here, I hope you’ve learned some good new skills and are cooking up some excellent ideas for your own browser extensions.

GitHub

The GitHub repo for this article’s project is available here:

everythingPikachu

Happy Coding,

Jackson

Better Programming

Advice for programmers.

Written by Jackson Prince

Full Stack Engineer. Film enthusiast. Twitter: @JPrinceDev
