How We 20x Our Chrome Extension Uninstalls

Davis Gay
Published in Rate Engineering
7 min read · May 5, 2019

No, that is not a typo. We really screwed up.

What happened?

At 8:31am on 26 April 2019, we picked up a user report mentioning that our Chrome extension had opened a tab to the demo page.

What our demo page looks like

It turned out that our extension's version 2.1.0.10 update, which included a small change to open the demo page on install, had a bug: the demo page was opened every time the extension's background.js ran.

For those unfamiliar with Chrome extensions, that means the demo opened whenever the extension was initialized, for example when Chrome was opened, or when the computer woke from sleep and the user switched to Chrome.

This incident caused 296 uninstalls within 24 hours. Our daily uninstalls usually range between 10 and 20, so this was roughly 20 times the midpoint of a normal day. Although the bug was simple and was promptly fixed (more below), it caused a huge spike in uninstallations. We even received death threats among words of encouragement.

Bugs in production deployments are not uncommon, but we feel that there are a few good lessons here worth documenting and sharing.

Timeline

We set Thursday as our deployment day (a lesson learnt from bad Friday deployments and having poorer response time over the weekend). We were rushing other features into the release and hence there was a late deployment.

25 April 2019
20:36 SGT: Our engineers submitted extension version 2.1.0.10 to Chrome Store.

21:53 SGT: First uninstallation form received. It described the issue calmly: “It opened a shopping page by itself under the pretense [sic] of showing me how it works, while i had already used it before.”

23:46 SGT: First direct report from a user support channel, “the extension opens a tab to demo out of nowhere.”

26 April 2019
08:31 SGT: First eyes on the customer-reported issue.

09:00 SGT: Issue escalated and raised to Engineering team.

09:11 SGT: An engineer on another team confirmed that the issue was occurring for all users on 2.1.0.10. Issue escalated to highest priority.

10:19 SGT: The engineer who made the initial change opened a PR to fix the bug. Extension version 2.1.0.11 was submitted to Chrome Store. (We start work at 10:00.)

11:57 SGT: We sent an email to all our users detailing the issue and explaining that a patch was already under review. We also provided an immediate workaround (unfortunately, disabling the extension).

12:57 SGT: Version 2.1.0.11 is published.

Why did it happen?

1. Why did we want to open the demo tab on install?

The demo is an interactive step-by-step guide to the core feature flow of our extension. The change to open the demo tab on installation was suggested because not many people were viewing the demo. A user can open the demo from two places: a button on a page that opens right after installation, and the “Accounts” tab within the extension itself.

Not many users clicked on our tutorial button right after installation
“Try Demo” on the Account tab was not frequented either :(

2. Why did the tab open on extension initialization?

Unfortunately, this was an unintended bug that was not caught in our development and deployment process. It caused the tab to open every time the extension was initialized.

To give some context, below is an excerpt of the erroneous change:

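class DemoInitializer {
  public static initialize() {
    // old: set up a listener that adds a demo link to the post-installation page
    // this.setUpInstallListenerToIncludeLinkToDemoOnInstallPage()

    // new (buggy): opens the demo page directly, on every initialization
    this.openLazadaDemoPage()
  }

  ...
}

The original function, which placed the demo button on the post-installation page, was replaced with a function that opens the demo page.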

3. Why was the bug not caught before it was deployed?

Since the code change was small, it was easy to overlook the underlying problem: DemoInitializer is called on every extension initialization.

The author assumed that the function that places the demo link is called once, right after installation. In fact, it is called on every initialization; it merely registers a listener for the post-installation page, which is itself only opened through a listener on onInstalled.
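To make the distinction concrete, here is a minimal sketch of what an install-only version could look like. It is not our actual fix: the ExtensionApi interface is a hand-written stand-in for the small part of the chrome global it uses, DEMO_URL is a placeholder, and the reason check is an assumption about the desired behaviour (Chrome's onInstalled event also fires on extension and browser updates).

```typescript
// Minimal structural types for the parts of the Chrome extension API used
// here. In a real extension these come from the `chrome` global; they are
// declared locally so the sketch is self-contained.
interface InstalledDetails {
  reason: string; // e.g. "install", "update", "chrome_update"
}

interface ExtensionApi {
  runtime: {
    onInstalled: { addListener(cb: (details: InstalledDetails) => void): void };
  };
  tabs: { create(props: { url: string }): void };
}

// Placeholder URL (assumption); the real demo address is not shown here.
const DEMO_URL = "https://example.com/demo";

class DemoInitializer {
  // Runs on every background-script initialization, so it only registers
  // a listener and never opens a tab directly.
  public static initialize(api: ExtensionApi): void {
    api.runtime.onInstalled.addListener((details) => {
      // onInstalled also fires on updates; only a fresh install
      // should show the demo.
      if (details.reason === "install") {
        api.tabs.create({ url: DEMO_URL });
      }
    });
  }
}
```

In the background script this would be called with the real chrome global, possibly with a cast depending on the type definitions in use. Passing the API in as a parameter also makes the behaviour easy to exercise with a fake.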

The code of setUpInstallListener…() was hidden from the reviewer since there was no change to the code within the function.

Testing and QA were done, but only the positive case was tested. Nobody thought of re-opening Chrome as a negative case.
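The missing negative case can be sketched as a tiny simulation. Everything here is hypothetical: FakeBrowser, buggyInitialize and fixedInitialize are simplified stand-ins for illustration, not our real code or the Chrome API. The idea is to count demo tabs across one install followed by several browser restarts.

```typescript
// Records every tab the code under test tries to open.
class FakeBrowser {
  public openedTabs: string[] = [];
  private installListeners: Array<() => void> = [];

  public openTab(url: string): void {
    this.openedTabs.push(url);
  }

  public onInstalled(listener: () => void): void {
    this.installListeners.push(listener);
  }

  // Simulates the one-time install event.
  public fireInstalled(): void {
    for (const listener of this.installListeners) listener();
  }

  // Simulates Chrome restarting: the background script is loaded afresh,
  // so previously registered listeners are gone.
  public restart(): void {
    this.installListeners = [];
  }
}

type Initializer = (browser: FakeBrowser) => void;

// Stand-in for the buggy behaviour: opens the demo on every initialization.
const buggyInitialize: Initializer = (b) => b.openTab("demo");

// Stand-in for the intended behaviour: opens the demo only on install.
const fixedInitialize: Initializer = (b) => b.onInstalled(() => b.openTab("demo"));

// Counts demo tabs opened across one install plus `restarts` restarts.
function demoTabsOpened(init: Initializer, restarts: number): number {
  const browser = new FakeBrowser();
  init(browser); // first load, right after installation
  browser.fireInstalled();
  for (let i = 0; i < restarts; i++) {
    browser.restart();
    init(browser); // background script runs again; no install event
  }
  return browser.openedTabs.length;
}

console.log(demoTabsOpened(buggyInitialize, 3)); // 4
console.log(demoTabsOpened(fixedInitialize, 3)); // 1
```

With zero restarts both versions open exactly one tab, which is why a positive-only test passes for the buggy code. The difference only shows up once a restart is simulated.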

4. Why was the bug not detected on production earlier?

Chrome Store review usually takes about 1–2 hours before the new version is live and available to users. Chrome extensions are updated automatically (typically within 24 hours with active Chrome usage).

The extension team did not watch for the new version to go live and force an update to test the new features, because the deployment was done after office hours, at about 8pm.

5. Why was the deployment done beyond official hours?

There was a rush to deploy on Thursday, and the new version did not include complex changes. The team also knew to avoid deploying on Fridays (from previous deployment lessons).

6. Why were uninstallations still increasing after the fix was live on Chrome Store?

Unfortunately, many users received 2.1.0.10 on Friday morning (the start of work means booting up computers and opening Chrome). Since their browsers had only just updated the extension, it is reasonable to assume that they took longer to check for a newer version.

The uninstallation spike died down on Friday evening.

What was good

Quick reaction measures after (late) detection
A Thursday deployment meant the whole product team was around the next day to mitigate the impact of the crisis. A hotfix was submitted promptly (but was still at the mercy of Chrome Store review time). Unfortunately, because of the autoupdate mechanism mentioned above, the patch was delayed for the people who needed it the most. At least subsequent autoupdates fetched the patched version of the extension.

Teamwork
There was no finger-pointing. The entire team was focused on mitigating the impact of this simple but deadly bug. No engineer jumped on the crisis to prove themselves. They all knew that the engineer who made the error was the best person to fix the issue with near-zero risk of introducing another initialization-related bug.

The customer satisfaction team bore the brunt of customer complaints, and they handled frustrated customers without a single complaint of their own.

The marketing team stepped in and helped to craft a cute PSA email to assure our users that a fix was on the way and to ask for their patience and kind understanding.

Analytics team crunched the data and provided a silver lining — most of the uninstallations were not from active users.

Uninstallation forms worked
During this incident, we received far more written comments in addition to the usual checkboxes of uninstallation reasons. The bug was (rightly) so frustrating that users took the time to describe the issue in text. Of course, their comments conveyed a spectrum of emotions. Perhaps prior to this incident, we had not angered our users enough for them to type out their complaints.

What can be improved

A lower friction reporting channel
We received more user feedback from uninstallation forms than our official support channels. We should work to hear from users before they uninstall our product.

Be even less intrusive than we already are
This incident is a strong data point on how much users dislike intrusive Chrome extensions (or intrusive applications in general). We have always strived to design our extension so that it appears at the right time, when users need us. However, what counts as the right time can be subjective. After this incident, we should be stricter about our assumptions and even more user-centric.

Emphasize principles before rules
Rules like “no Friday deployments” are meant to ensure that the impact of a bad deployment is mitigated. However, when rules are followed without context, they become counter-productive, as seen in this incident. Sure, Thursday evening deployments are still better than Friday evening deployments, but arguably a Friday morning deployment is better than a Thursday evening one.

Ultimately, what is important is to have a manageable deployment with strong risk mitigation. It does not matter how that is achieved.

Putting principles before rules extends beyond deployments and software engineering. Principles motivate people through beliefs, goals and hope. Rules motivate through fear and punishment. Principles are flexible, whereas rules are rigid. However, this is not to say we should avoid rules altogether. Rules control; principles guide. There are no silver bullets here: there are times when one trumps the other.

Any thoughts you would like to share? Think there are additional measures we can take? Or perhaps you have your own deployment horror stories?

We will be happy to hear from you in the comments!
