The State of Ecommerce — 2019 Report

Govind Chandrasekhar
The Ecommerce Intelligencer
May 27, 2019


Over the course of two weeks in February 2019, two Semantics3 engineers crawled the entire universe of dotcom domains looking for ecommerce sites. We built our numbers bottom up, surveying over 138.2 million dotcom sites and analyzing ~6 million merchants. This report presents the observations that we gathered. Scroll to the end to access a downloadable PDF and interactive infographic.

Most reports on the state of ecommerce inevitably focus on dynamics associated with the largest players alone, typically built on financial reports from listed companies. In this report, we aim to provide a different perspective of the industry in two key ways:

  • We’ve holistically looked at all stakeholders in the industry, not just a sampled non-random subset, by analyzing all active ecommerce dotcom websites.
  • We’ve explored diverse aspects of the industry, including third-party marketplaces, social media platforms, hosting platforms, promotional channels, product catalogs & categories and technical intricacies.

The size of ecommerce

We discovered 138,293,352 unique dotcom domains across the entire Internet (as of mid-February 2019). Of these, ~21.9% were invalid or erroneous and 10.5% redirected to other dotcom domains in the list.

Of the remaining 93,482,546 valid domains, 68% were detected as primarily English sites.

Using our categorization algorithms we identified that 5,964,972 of these domains (~9.4% of all valid English dotcom domains) engage in ecommerce activity.
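The funnel above can be re-derived from the quoted figures. The snippet below is only a minimal arithmetic sketch of that check, not the pipeline used for the report; all variable names are illustrative, and the inputs are the numbers stated in this section.

```python
# Figures quoted in this section of the report.
total_domains = 138_293_352      # unique dotcom domains discovered
invalid_share = 0.219            # ~21.9% invalid or erroneous
redirect_share = 0.105           # 10.5% redirect to other dotcom domains
valid_domains = 93_482_546       # remaining valid domains, as reported
english_share = 0.68             # 68% detected as primarily English
ecommerce_domains = 5_964_972    # domains engaged in ecommerce activity

# Valid domains implied by the rounded percentages (roughly 93.49 million).
approx_valid = total_domains * (1 - invalid_share - redirect_share)

# Valid English domains, then the share of those that sell online.
english_valid = valid_domains * english_share
ecommerce_rate = ecommerce_domains / english_valid

print(f"{ecommerce_rate:.1%}")  # prints 9.4%
```

The small gap between `approx_valid` and the reported 93,482,546 is what one would expect from the percentages being rounded to one decimal place.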

What are people selling online?

Clothing, Beauty and Baby Products are the most popular categories of products sold online.

Additionally, we discovered that the catalog size of the median seller is very low. 93% of all ecommerce retailers (roughly 5.5 million retailers) have catalogs of fewer than 500 products. Only approximately 100 sellers have a catalog of more than 500,000 products.

It is no surprise that Facebook is clearly the preferred choice of social media platform across the ecommerce universe (more details in the “Social media usage of online sellers” section). What is interesting, however, is the category distribution of Facebook compared with up-and-coming social networks like Instagram and Pinterest.

When platforms are grouped by categories and normalized by their overall market share, we see unique dynamics of category affinity emerging. Platforms specialize in select groups of categories, strongly reflective of the nature of their user bases. For example, Pinterest shows the greatest affinity for Clothing sellers, while Instagram appeals to sellers of Beauty products.

Similarly, taken as a whole, Amazon is the clear choice of marketplace for sellers who wish to crosspost their products on external platforms (more details in “The prevalence of crossposting” section). However, interesting patterns emerge when we explore this through a category-specific lens. We looked at the proportion of listed sites in each category and adjusted it against the category distribution of the average ecommerce domain in our set.

What we found is that the strategy of crossposting on marketplaces is more prevalent in some categories (Beauty and Office) than others. For categories that do show a positive listing preference, we see that Etsy is a clear draw for sites that sell Arts/Crafts and Jewelry products, just as Amazon is for Books and Footwear, and eBay is for Electronics & Vehicles.
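One common way to express this kind of normalization is a ratio: a platform's share of sites in a category divided by that category's share across all ecommerce sites. The sketch below assumes this reading. The function name `category_affinity` and the counts are hypothetical toy values, not data from the report, and the report's exact adjustment may differ.

```python
def category_affinity(platform_counts, overall_counts):
    """Return, per category, the platform's category share divided by the
    overall category share. Values above 1 suggest a positive affinity.

    Assumes both totals and every overall category count are non-zero.
    """
    platform_total = sum(platform_counts.values())
    overall_total = sum(overall_counts.values())
    affinity = {}
    for category, overall in overall_counts.items():
        platform_share = platform_counts.get(category, 0) / platform_total
        overall_share = overall / overall_total
        affinity[category] = platform_share / overall_share
    return affinity


# Hypothetical toy counts, for illustration only.
overall = {"Clothing": 500, "Beauty": 300, "Books": 200}
platform = {"Clothing": 30, "Beauty": 50, "Books": 20}

affinity = category_affinity(platform, overall)
for category, score in affinity.items():
    print(category, round(score, 2))
```

In this toy example the platform over-indexes on Beauty (ratio above 1), under-indexes on Clothing (below 1), and matches the overall mix for Books (exactly 1).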

The popularity of platforms for selling online

WooCommerce leads the pack by a significant margin in terms of the number of ecommerce sites that use its platform tools. Shopify is in second place, with a market share ~20–25% lower than that of WooCommerce.

In terms of sheer number of sites, WooCommerce leads Shopify in all categories except Jewelry and Luggage.

Social media usage of online sellers

Facebook is still the preferred choice of social media platform across the ecommerce universe.

We dug deeper to see how Facebook stacks up against its up-and-coming competitors. Specifically, we looked at how unique the website base of each network is. Almost counterintuitively, we found that of all the platforms, Instagram (owned by Facebook) had the least usage overlap with Facebook. A full 14% [7.3% ÷ (7.3% + 43.3%)] of sites that use Instagram did not have a Facebook page, a higher proportion than that of Pinterest (7.5%) and even Twitter (9.2%). Among the up-and-comers, the user bases are a bit more distinct; a full 37.5% of Pinterest’s user base does not detectably promote on Instagram.
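The bracketed calculation can be spelled out as follows. This is a minimal sketch that assumes 7.3% is the share of sites using Instagram without Facebook and 43.3% is the share using both; the function name is illustrative.

```python
def share_without_other(only_pct, both_pct):
    """Fraction of a platform's sites that do not also use the other platform."""
    return only_pct / (only_pct + both_pct)


instagram_only = 7.3           # % of sites with Instagram but no Facebook page
instagram_and_facebook = 43.3  # % of sites with both (assumed reading)

share = share_without_other(instagram_only, instagram_and_facebook)
print(f"{share:.1%}")  # prints 14.4%
```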

The prevalence of crossposting

Crossposting refers to the tendency of ecommerce sites/businesses to also list their products on external marketplace platforms such as Amazon and Etsy. Amazon is, as expected, the preferred choice of marketplace for crossposting, though Etsy is not too far behind.

Additionally, we found that crossposting across multiple marketplaces is minimal, averaging roughly 2%. This suggests that few ecommerce sites are willing to take on the hassle of dealing with multiple marketplaces. This could be due to logistical challenges, fear of dilution of goodwill (reviews & ratings split across platforms) or just the sheer headache of actively maintaining multiple catalog databases.

Under the hood

In this study, we also took a close look at the technical aspects of these ecommerce businesses.

On average, ecommerce sites are more secure than the average dotcom site; ~27.5% of all dotcom domains use HTTPS vs. ~48.2% for ecommerce domains. While this seems encouraging at first glance, the level of HTTPS adoption is suboptimal given the more sensitive nature of the information routed through ecommerce sites (payment and identity information) compared with that of an average dotcom site, and given the data privacy and security issues of the recent past. One caveat to note here: this analysis was run on the homepage of these sites, and doesn’t account for sites that use external payment gateways.

Shopify sites are almost 100% encrypted. This is likely because Shopify allows its clients to encrypt their stores at no additional cost.

Additionally, all but one of the platforms have a median response time of ~1 second, with Shopify being the fastest. Interestingly, WooCommerce sites show a response time of ~3 seconds, even though WooCommerce has the largest install base. This disparity compared with Shopify could reflect the fact that WooCommerce allows open-source DIY installations, unlike Shopify, which primarily provides hosted solutions. Also notable is that ~25% of WooCommerce sites were detected to be outside of North America (compared to <1% for Shopify), though this explanation doesn’t stand up to the fact that Magento (30% outside N. America) and BigCommerce (45% outside N. America) deal with similar challenges.

Semantic web standards allow for easy sharing of data between platforms: think of the blurb that’s auto-generated when you paste a link on Facebook or Twitter. Given the value of adopting these standards, and benchmarked against dreams of a fully Semantic Web, these levels of adoption are underwhelming.
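As a concrete illustration, the link previews on Facebook are typically driven by Open Graph `<meta>` tags in a page's HTML. The sketch below shows one way to detect such tags with Python's standard library. It is an assumption for illustration only: the report does not say which standards were measured or how, and `OpenGraphParser`, `extract_open_graph` and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> tags from a page."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property") or ""
        content = attr_map.get("content")
        if prop.startswith("og:") and content is not None:
            self.tags[prop] = content


def extract_open_graph(html):
    """Return a dict of Open Graph properties found in the given HTML."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.tags


# Hypothetical sample page, for illustration only.
sample_html = """<html><head>
<meta property="og:title" content="Example Store">
<meta property="og:type" content="website">
<meta name="description" content="Not an Open Graph tag">
</head><body></body></html>"""

tags = extract_open_graph(sample_html)
print(tags)
```

A site could then be counted as adopting the standard if `tags` is non-empty, though a real survey would likely check for other formats as well.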


The data for this report was gathered primarily by crawling all of the websites in the dotcom domain space over a period of two weeks in February 2019. We crunched over 2.5 billion data points using the systems that power our core products at Semantics3, primarily our Crawl, Categorization and NER APIs.

This report comes with a few key caveats. First, all websites were treated as equal, regardless of their revenue. Second, we’ve defined ecommerce websites as any website which permits or motivates online transactions. Third, we’ve only looked at primary domains and not subdomains. Fourth, all of the crawls were run on servers in North America.

For any questions, email us at

About Semantics3

Every ecommerce-focused company has to re-invent the product data science stack; Semantics3 provides this entire stack through a suite of Data + AI APIs.


Download a PDF version of this report

Interactive infographic:

This article was originally published on the Semantics3 Blog