A closer look at BigCommerce customer downtime

As an insurtech startup that has developed technology to better model and underwrite the risk of online revenue interruption, we collect *a lot* of data on ecommerce downtime: billions of records that, when aggregated, highlight service disruptions and outages.

This data is valuable as the means to train our machine learning models, but it typically stays in the background for our customers and partners, because we either present quotes or discuss a model’s performance metrics from an insurance perspective.

Recently, though, we were asked: if a company is on [an ecommerce platform like] Shopify or BigCommerce, do they have to worry about downtime?

The short answer is yes, and the question gives us an excellent opportunity to explore our data in more depth.

BigCommerce from a systems availability perspective

  1. BigCommerce has over 60,000 customers
  2. It employs 690 people full-time (190 in research and development)
  3. The majority of its technology is built on Google Cloud; ancillary functions are hosted with AWS
  4. There are two main ways to use its technology. The first is directly via Native Storefronts, a cloud-based subscription service that offers a theming framework with over 100 fully customizable theme templates and variations
  5. The second is Headless Storefronts, which are open APIs that enable storefront development on leading CMSs like WordPress and Drupal
  6. Its Apps Marketplace has over 600 apps and integrations that use these APIs

From a systems availability perspective, consider risk as the unexpected consequences of an organization of people, process, technology, partnerships, and adversarial forces moving through time.

Now consider stacking a few of these organizations on top of each other, the way BigCommerce builds on Google Cloud. A fair estimate of its availability would then be between 99.9% and 99.95%, which corresponds to an average of between 4.38 and 8.77 hours of downtime a year over a very long period of time.
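
The conversion from an availability percentage to hours of downtime is simple arithmetic. A minimal sketch, assuming a year of 365.25 days (8,766 hours); the function name is ours and purely illustrative:

```python
HOURS_PER_YEAR = 365.25 * 24  # 8766 hours, averaging over leap years


def downtime_hours_per_year(availability: float) -> float:
    """Expected yearly downtime, in hours, for a fractional availability."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * HOURS_PER_YEAR


for availability in (0.999, 0.9995):
    hours = downtime_hours_per_year(availability)
    print(f"{availability:.2%} availability -> {hours:.2f} hours of downtime per year")
```

With these inputs the two bounds come out at roughly 8.77 and 4.38 hours, matching the range quoted above.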

But BigCommerce customers also represent their own organizations, and often their technology footprint is not exclusive to their ecommerce platform provider.

More risk gets stacked.
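
One common way to reason about stacked risk is to multiply the availabilities of every layer a storefront depends on. This is a sketch under a strong assumption, namely that the layers fail independently and that the site is down whenever any one of them is down. The layer values below are hypothetical, not measurements:

```python
from math import prod


def stacked_availability(layers):
    """Availability of a chain in which every layer must be up at once.

    Assumes independent failures, which real systems only approximate.
    """
    return prod(layers)


# Hypothetical layers: cloud provider, ecommerce platform, merchant add-ons.
layers = [0.9995, 0.9995, 0.999]
combined = stacked_availability(layers)
print(f"combined availability: {combined:.4%}")
```

The combined figure is always lower than the weakest single layer, which is the point of the stacking argument.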

What our data shows

Over a three-month period, we found that on average 90.8% of these sites experienced zero downtime.

The remaining sites did see downtime. As a share of all sampled sites:

  • 9.2% saw an outage of over 1 hour,
  • 4.4% over 4 hours,
  • and 3.3% over 8 hours

BigCommerce customer downtime exceeding the thresholds of 1, 4, and 8 hours, measured by the availability of each customer’s default website, e.g. https://www.company.com
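
Threshold shares like these can be computed from per-site downtime totals. A minimal sketch, assuming one downtime total (in hours) per site; the ten values are made up for illustration and are not our data:

```python
def share_exceeding(downtime_hours, thresholds=(1, 4, 8)):
    """Fraction of sites whose total downtime exceeds each threshold (hours)."""
    total = len(downtime_hours)
    if total == 0:
        raise ValueError("need at least one site")
    return {t: sum(1 for h in downtime_hours if h > t) / total for t in thresholds}


# Made-up downtime totals for ten hypothetical sites.
sample = [0, 0, 0, 0, 0, 0, 0.5, 2, 5, 30]
shares = share_exceeding(sample)
for threshold, share in shares.items():
    print(f"over {threshold} hour(s): {share:.1%}")
```

The groups are nested: a site that exceeds 8 hours is also counted in the 4-hour and 1-hour groups.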

Using the one-hour threshold as a base, the overall average availability for these BigCommerce customers can be calculated as roughly 99.8%, which is equivalent to 1.2 hours of downtime per month or 14.5 hours per year over a very long period of time.

For the last group, where downtime exceeded the 8-hour threshold, the mean downtime per site was 28.8 hours, and outliers extended past 168 hours, or 7 days. This represents an average availability of 96%.
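
Going the other way, from hours of downtime to an availability figure, only requires the length of the measurement window. In the sketch below the window lengths are assumptions: an 8,766-hour year for the 14.5-hour figure, and a 30-day month (720 hours) for the 28.8-hour figure, which is the window under which 28.8 hours works out to 96%.

```python
def availability(downtime_hours: float, window_hours: float) -> float:
    """Fraction of the window during which a site was reachable."""
    if window_hours <= 0 or not 0 <= downtime_hours <= window_hours:
        raise ValueError("downtime must fit inside a positive window")
    return 1.0 - downtime_hours / window_hours


yearly = availability(14.5, 365.25 * 24)  # 14.5 hours over an average year
monthly = availability(28.8, 30 * 24)     # 28.8 hours over an assumed 30-day month
print(f"{yearly:.1%} and {monthly:.0%}")
```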

BigCommerce customer downtime exceeding the threshold of 8 hours

Customer complexity

The basic mechanism of our risk modeling is to use machine learning to continuously best-fit the features of each site to its observed downtime. We can then use this model on new data to predict the probability of significant downtime.
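
To make the fit-then-predict loop concrete, here is a toy sketch. It uses plain logistic regression trained by gradient descent as a stand-in; it is not a description of our production models, and the three binary features and the labels are invented for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(weights, bias, x):
    """Estimated probability of significant downtime for one feature vector."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and a bias by batch gradient descent on log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = predict_probability(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Invented features per site: [javascript_changed, dns_changed, extra_hosting]
rows = [[0, 0, 0], [0, 0, 0], [1, 0, 0], [0, 1, 0],
        [1, 1, 0], [1, 1, 1], [0, 1, 1], [1, 0, 1]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = significant downtime was observed

weights, bias = fit_logistic(rows, labels)
quiet = predict_probability(weights, bias, [0, 0, 0])
busy = predict_probability(weights, bias, [1, 1, 1])
print(f"no changes: {quiet:.2f}, many changes: {busy:.2f}")
```

Refitting as new observations arrive is what "continuously" means here; in this sketch that would simply be another call to `fit_logistic` on the updated rows.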

A snapshot of this process offers a view of these sites in terms of complexity, using as a proxy the features we’ve detected or derived that proved useful to our model.

To summarize the features of a current model for our subset of BigCommerce customers:

  • Many relate to the recent addition or removal of known JavaScript code, such as libraries and widgets. This makes sense on an ecommerce platform where storefronts are easily customized
  • Many relate to CSS framework changes
  • Many relate to DNS changes
  • Many relate to digital certificates or TLS changes
  • Many relate to technologies unrelated to BigCommerce like web application frameworks, web servers, and additional hosting providers
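
Change-based features like those in the list above can be derived by comparing two snapshots of the same site. A minimal sketch; the snapshot fields (`scripts`, `name_servers`, `certificate_id`) and the sample values are hypothetical and do not reflect our actual schema:

```python
def change_features(previous, current):
    """Compare two snapshots of one site and flag what changed between them."""
    prev_scripts = set(previous.get("scripts", []))
    curr_scripts = set(current.get("scripts", []))
    return {
        "scripts_added": sorted(curr_scripts - prev_scripts),
        "scripts_removed": sorted(prev_scripts - curr_scripts),
        "dns_changed": set(previous.get("name_servers", []))
        != set(current.get("name_servers", [])),
        "certificate_changed": previous.get("certificate_id")
        != current.get("certificate_id"),
    }


# Hypothetical snapshots of one storefront, taken a week apart.
last_week = {
    "scripts": ["jquery", "reviews-widget"],
    "name_servers": ["ns1.example.com", "ns2.example.com"],
    "certificate_id": "cert-a",
}
this_week = {
    "scripts": ["jquery", "chat-widget"],
    "name_servers": ["ns1.example.net", "ns2.example.net"],
    "certificate_id": "cert-a",
}
features = change_features(last_week, this_week)
print(features)
```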

Also calculated:

  • 16.3% of sampled customers are not exclusive when it comes to using BigCommerce as their hosting provider
  • 20.1% are not exclusive when it comes to DNS name servers
  • 65.5% are not exclusive when it comes to email hosting

Features include those collected, derived, and also those that represent a behavioral change over time, e.g. updating DNS, adding or removing JavaScript
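
The exclusivity percentages can be read as the share of sites that use at least one provider other than the platform for a given function. The sketch below follows that reading, which is an assumption on our part for this example; the field names, provider labels, and four sample sites are hypothetical:

```python
def non_exclusive_share(sites, field, platform="bigcommerce"):
    """Share of sites that list any provider besides the platform for a field."""
    if not sites:
        raise ValueError("need at least one site")
    mixed = sum(1 for site in sites if set(site.get(field, [])) - {platform})
    return mixed / len(sites)


# Four hypothetical sites with the providers detected for each function.
sites = [
    {"hosting": ["bigcommerce"], "email": ["provider-a"]},
    {"hosting": ["bigcommerce", "provider-b"], "email": ["provider-a"]},
    {"hosting": ["bigcommerce"], "email": ["bigcommerce"]},
    {"hosting": ["bigcommerce"], "email": ["provider-c"]},
]
hosting_share = non_exclusive_share(sites, "hosting")
email_share = non_exclusive_share(sites, "email")
print(f"hosting: {hosting_share:.1%}, email: {email_share:.1%}")
```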

Summary

A reasonable expectation is three nines, or 99.9% availability: on average, one outage of roughly 9 hours per year, or several smaller ones clustered randomly.

We’ve seen much worse, however.

Fortunately, many ways to mitigate these outages are very much within a company’s control, including building on stable, well-supported platforms (like BigCommerce) and balancing the use of modern technology with the accessibility of the talent needed to maintain it.