5 domains of ecommerce Data Strategy

How to power an effective data strategy within the ecommerce space

Following a series of articles on e-commerce data covering Customer Lifetime Value, Personalization and some data structures, I wanted to cover five core domains of an e-commerce data strategy: Pricing, Ads Analytics, CRM, AB testing and Personalization.

Data can add value to an e-commerce company in many areas beyond these five. Nevertheless, these are the areas of key focus for most commercially minded companies.

Pricing & Promotions

Pricing is one of the key areas of focus for an e-commerce strategy, and making sure the right data and insights are uncovered with respect to pricing is essential.

Understanding the impact of pricing on demand is of particular importance to e-commerce companies. They often need to build scale in order to be profitable, and lowering prices has a direct effect on conversion rate, increasing the quantity sold and helping achieve further scale.

Having estimates of the price elasticity of demand for different categories of products helps us make the right tradeoffs. Furthermore, spending a given amount on lowering prices can be more efficient for acquisition than adding a comparable amount to the marketing budget, due to the leverage effect of value added taxes.

Price elasticities can be estimated on a large catalogue by varying the price of items within a common category, or by setting up a multivariate discount experiment that offers different levels of discount on the same products to different customers.
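As an illustration of the discount experiment described above, here is a minimal sketch of one common way to estimate elasticity: a log-log regression, where the slope of ln(quantity) on ln(price) is read as the elasticity. The function name and the figures are hypothetical, and a real analysis would also control for seasonality, traffic and product mix.

```python
import math

def estimate_elasticity(prices, quantities):
    """Slope of ln(quantity) on ln(price), fitted by ordinary least squares."""
    log_p = [math.log(p) for p in prices]
    log_q = [math.log(q) for q in quantities]
    n = len(log_p)
    mean_p = sum(log_p) / n
    mean_q = sum(log_q) / n
    cov = sum((x - mean_p) * (y - mean_q) for x, y in zip(log_p, log_q))
    var = sum((x - mean_p) ** 2 for x in log_p)
    return cov / var

# Hypothetical experiment cells: price shown and units sold at that price
prices = [20.0, 18.0, 16.0, 14.0]
quantities = [100, 124, 156, 204]

elasticity = estimate_elasticity(prices, quantities)
print(f"Estimated price elasticity of demand: {elasticity:.2f}")
```

An elasticity below -1 would suggest that, for this category, a price cut increases units sold more than proportionally.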

Another aspect is cost based pricing, and particularly the split between fixed and variable costs. Being able to calculate contribution margin is one of the most important factors in e-commerce, where the amortization of fixed costs often happens through achieving high volume. Understanding what should be the price floor for each item by calculating an entitlement value cost estimate… Operating cost based pricing effectively requires sufficient data and insight to both attribute and allocate the different elements of the cost structure. Typically a thorough exercise of data collection, validation and maintenance needs to be undertaken in order to obtain accurate estimates.
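To make the fixed versus variable split concrete, here is a small sketch. It assumes the price floor is simply the sum of per-unit variable costs (the price at which contribution margin is zero), which is a simplification and not necessarily the entitlement value estimate mentioned above. All cost names and figures are hypothetical.

```python
import math

def price_floor(variable_costs_per_unit):
    """Price at which the contribution margin is exactly zero."""
    return sum(variable_costs_per_unit.values())

def contribution_margin(price, variable_cost_per_unit):
    """What each unit sold contributes towards covering fixed costs."""
    return price - variable_cost_per_unit

def break_even_units(fixed_costs, price, variable_cost_per_unit):
    """Units needed for contribution margin to amortize fixed costs."""
    margin = contribution_margin(price, variable_cost_per_unit)
    if margin <= 0:
        raise ValueError("price is at or below the price floor")
    return math.ceil(fixed_costs / margin)

# Hypothetical per-unit variable costs for one item
costs = {"purchase": 10.0, "shipping": 3.5, "payment_fees": 0.5, "packaging": 1.0}
floor = price_floor(costs)

print("Price floor:", floor)
print("Margin at 25.0:", contribution_margin(25.0, floor))
print("Break-even units:", break_even_units(10_000.0, 25.0, floor))
```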

Competitive pricing is sometimes one of the most effective strategies when offering commodity products. Customers have become accustomed to checking comparison websites to determine whether it is worth purchasing from a given merchant or website. Competitors’ data can be obtained by calling a price comparison API (such as priceapi.com) or by checking competitive prices on Amazon. Another option is to scrape competitors’ websites with Python libraries such as BeautifulSoup.

Dynamic pricing is another area where data has made inroads. Uber has its surge pricing, and Amazon is known for changing its prices multiple times a day and has been accused of increasing prices based on demand [1].

Dynamic pricing usually involves getting a good grip on market conditions: supply, using indicators such as inventory position and time to resupply, and demand, using indicators such as specific page views, cab requests, etc.

Overall, each of these pricing analyses requires specific data points and capabilities: experimentation capabilities and sales order data to test price elasticity, full cost structure estimates to provide cost based pricing, the full catalogue together with competitors’ catalogues and pricing to offer competitive pricing, and finally demand and supply estimators to offer dynamic pricing. These capabilities are complementary, and the best pricing decisions are made when taking each of them into account.
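As a toy illustration of combining supply and demand indicators, the sketch below compares days of stock cover with time to resupply and nudges the price accordingly. The rule, the parameter names and the 20% caps are assumptions made for this example, not a description of how Uber or Amazon price.

```python
def dynamic_price(base_price, units_in_stock, daily_demand, days_to_resupply,
                  max_uplift=0.20, max_discount=0.20):
    """Raise the price when stock would run out before resupply, lower it when overstocked."""
    if daily_demand <= 0:
        # No observed demand: apply the maximum discount
        return round(base_price * (1 - max_discount), 2)
    days_of_cover = units_in_stock / daily_demand
    cover_ratio = days_of_cover / days_to_resupply
    if cover_ratio < 1:
        # Scarcity: uplift grows as cover shrinks relative to resupply time
        adjustment = max_uplift * (1 - cover_ratio)
    else:
        # Excess stock: discount grows with surplus cover, capped at max_discount
        adjustment = -max_discount * min(cover_ratio - 1, 1)
    return round(base_price * (1 + adjustment), 2)

print(dynamic_price(100.0, units_in_stock=50, daily_demand=10, days_to_resupply=10))
print(dynamic_price(100.0, units_in_stock=200, daily_demand=10, days_to_resupply=10))
```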

Advertising Analytics

Advertising Analytics helps with different digital marketing challenges, such as how to attribute a sale to a given marketing channel, how to best allocate spend across channels to maximize ROI, how to identify the right audience segments for a given proposition…

Customer Acquisition Cost (CAC / CPA): Andreessen Horowitz, the legendary venture capital firm, lists CAC as one of the key business metrics for understanding the promise and health of a given company. Companies need a good understanding of how their marketing investment translates into a customer acquisition cost, and how these customer acquisitions translate into profit (CLTV), in order to make sound investment decisions. CAC comes in different flavors. Sometimes it refers only to the advertising spend used to acquire the customer; my view is that it should take into account the full impact of a customer’s first order. Other times it is split into blended vs. paid, or organic vs. inorganic channels. While these views face attribution challenges, having them and understanding the limitations of these metrics does help in making more informed decisions.
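The different flavors of CAC can be illustrated with a few lines of arithmetic. The figures below are hypothetical, and the "loaded" variant is only one possible reading of taking the first order into account: it assumes first-order vouchers were given to paid customers only.

```python
def cac(spend, new_customers):
    """Acquisition cost per new customer for a given spend and cohort."""
    if new_customers <= 0:
        raise ValueError("new_customers must be positive")
    return spend / new_customers

# Hypothetical monthly figures
ad_spend = 50_000.0
paid_new_customers = 1_000
organic_new_customers = 1_500
first_order_discounts = 10_000.0  # assumed: vouchers given to paid customers only
assumed_cltv = 150.0              # assumed lifetime profit per customer

paid_cac = cac(ad_spend, paid_new_customers)
blended_cac = cac(ad_spend, paid_new_customers + organic_new_customers)
loaded_paid_cac = cac(ad_spend + first_order_discounts, paid_new_customers)

print("Paid CAC:", paid_cac)
print("Blended CAC:", blended_cac)
print("Paid CAC including first-order discounts:", loaded_paid_cac)
print("CLTV / paid CAC:", assumed_cltv / paid_cac)
```

The gap between paid and blended CAC shows why it matters to state which flavor is being reported.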

Marketing attribution: There is a multitude of attribution models, from single touch point (e.g. last click), to fractional (linear, time decay, position based, …) or probabilistic (trying to estimate the share of value brought by each touch point). Their aim is to divide the credit for a specific event or action among the different sources that contributed to it.

One of the main challenges within that domain is attributing sources to the right customers. It requires deduplicating identities and resolving cookies to a single customer, which is a prerequisite for attributing the right credit to each source. Difficulties in tying up identities across browsers, or issues arising from in-app browsers such as the blocking of third party cookies (e.g. the Facebook in-app browser), can make this tie up highly difficult.
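The single touch and fractional models mentioned above can be sketched as different weighting schemes over the same ordered path of touch points. The function names, the 7-day half-life and the 40/20/40 position split are illustrative assumptions. Tools differ in their defaults.

```python
from collections import defaultdict

def _credit(touchpoints, weights):
    """Normalize weights to sum to 1 and aggregate credit per channel."""
    total = sum(weights)
    credit = defaultdict(float)
    for channel, weight in zip(touchpoints, weights):
        credit[channel] += weight / total
    return dict(credit)

def last_click(touchpoints):
    return _credit(touchpoints, [0.0] * (len(touchpoints) - 1) + [1.0])

def linear(touchpoints):
    return _credit(touchpoints, [1.0] * len(touchpoints))

def time_decay(touchpoints, days_before_conversion, half_life_days=7.0):
    # A touch loses half its weight for every half-life before the conversion
    weights = [0.5 ** (d / half_life_days) for d in days_before_conversion]
    return _credit(touchpoints, weights)

def position_based(touchpoints, first=0.4, last=0.4):
    n = len(touchpoints)
    if n == 1:
        weights = [1.0]
    elif n == 2:
        weights = [0.5, 0.5]
    else:
        middle = (1.0 - first - last) / (n - 2)
        weights = [first] + [middle] * (n - 2) + [last]
    return _credit(touchpoints, weights)

# Hypothetical path to one order, oldest touch first
path = ["display", "email", "paid_search"]
print(last_click(path))
print(linear(path))
print(time_decay(path, days_before_conversion=[14, 7, 0]))
print(position_based(path))
```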

This domain is also supported by various ad tech tools such as Google Analytics 360, which offers various attribution models [1] [2].

Marketing Mix Modeling: a statistical technique that aims to measure and forecast the impact of different marketing activities on sales. It aims to provide insights on how to better allocate one’s advertising budget in order to obtain the right marketing mix and maximize sales and ROI.

In comparison to attribution modeling, marketing mix models take a helicopter view of the situation and try to attribute the impact of investment across marketing channels at an aggregate level. As such they are not prone to the same issues in terms of tying up identities, but they do not offer the same granularity.

Audience Targeting and Ads Personalization: Understanding the audience profile and personas, and how these can be used for ads personalization purposes, also falls within the advertising analytics domain. Techniques such as lookalike modeling help identify potential segments to target for different types of offers based on behavioral characteristics. Facebook for instance extended the traditional lookalike modeling to introduce value-based lookalikes.

A lot of the work needed from analytics practitioners here is data integration of one form or another, be it through the placement of trackers on the website or the integration of CRM data. Some analysis and interpretation of the data is also needed to guide the marketing teams in making the most effective decisions.

Tag management systems such as Ensighten and Tealium, and the combination of CDP, DMP and DSP, help provide this type of audience targeting and ad personalization experience. Various vendors are active in the CDP and DMP world, such as mParticle, Relay42, MediaMath, Krux or Adobe.

AB testing and Personalization are also highly important within this domain, but each deserves to be treated as an individual domain of e-commerce data strategy.

Effective advertising analytics relies on the availability of granular data on spend by channel and on sales. Data on the different customer touch points helps attribute sales more effectively and gives a better understanding of the different customer segments, so having as much customer data as possible is desirable.

Customer relationship management (CRM)

Customer relationship management typically makes use of analytics to enhance the different contacts the company has with its customers. CRM needs to balance contacts, content and offers, and analytics is there to make sure that these are selected appropriately and tailored to the customer.

Segmentation: Segmentation allows for better targeting of offers in marketing campaigns, and selecting the right targets for different types of offers has long been one of the areas where analytics adds value. More than just a selection on basic attributes, the contribution of analytics to segmentation is the creation of “smart” segments. These “smart” segments are typically created using techniques such as RFM, behavioral clustering (K-Means, DBSCAN…), decision trees or propensity models. These techniques allow analysts to identify different segments of customers based on their behavioral patterns or predicted responses, and to better tailor the audience to the intended communication, increasing relevancy.
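Of the techniques listed above, RFM (recency, frequency, monetary value) is the simplest to sketch. The example below builds the three measures from an order log and scores each from 1 (worst) to 3 (best). The thresholds, field layout and function names are assumptions for illustration. In practice scores are often derived from quantiles of the customer base.

```python
from datetime import date

def rfm_table(orders, today):
    """orders: iterable of (customer_id, order_date, amount) tuples."""
    stats = {}
    for customer_id, order_date, amount in orders:
        rec = stats.setdefault(
            customer_id, {"last": order_date, "frequency": 0, "monetary": 0.0}
        )
        rec["last"] = max(rec["last"], order_date)
        rec["frequency"] += 1
        rec["monetary"] += amount
    return {
        cid: ((today - r["last"]).days, r["frequency"], r["monetary"])
        for cid, r in stats.items()
    }

def score(value, thresholds, lower_is_better=False):
    """Score from 1 to len(thresholds) + 1 based on fixed thresholds."""
    s = 1 + sum(value >= t for t in thresholds)
    return len(thresholds) + 2 - s if lower_is_better else s

def rfm_score(recency_days, frequency, monetary):
    # Hypothetical thresholds: 30/90 days, 2/5 orders, 100/500 in revenue
    return (
        score(recency_days, (30, 90), lower_is_better=True),
        score(frequency, (2, 5)),
        score(monetary, (100.0, 500.0)),
    )

orders = [
    ("a", date(2024, 1, 1), 50.0),
    ("a", date(2024, 3, 1), 70.0),
    ("b", date(2023, 6, 1), 600.0),
]
table = rfm_table(orders, today=date(2024, 3, 11))
for customer_id, (r, f, m) in table.items():
    print(customer_id, rfm_score(r, f, m))
```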

Figure: Example customer journey

Customer Journeys: The importance of customer journeys within CRM activities has been pushed by omni-channel strategies. Customer journeys allow for better management of event driven marketing. They are supported by different enablers: real-time triggers that make it possible to provide just-in-time contextual offers, next best action that selects the most appropriate offer, and contact policies that ensure we don’t create customer fatigue.

These have created a real need for CRM activities to be properly supported by data and technologies. Decision engines, customer profiles and predictive analytics are key enablers supporting this vision of digitization. In turn, analysts can provide a lot of value by explaining how different classes of customers go through the different steps of the journey and what could be done to increase the journey’s impact.

Customer Lifetime Value: This is about understanding how the different factors of customer lifetime value evolve and are impacted by different sets of activities: tracking and understanding churn and retention behaviors, working out how to stretch ordering patterns through promotions, up-sell and cross-sell, and how to best support these activities with data driven recommendations.

The management of the customer lifecycle and taking appropriate measures to increase lifetime value are a key part of an effective CRM initiative. These can be enhanced by a strong data driven focus in the CRM tool’s decision making. These data initiatives are supported by marketing automation platforms such as Salesforce Marketing Cloud and Emarsys, which leverage the data present in their customer profiles and data sets to create journeys, build segments and optimize the offers sent to customers.
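Two of these enablers, next best action and contact policies, can be sketched in a few lines. Here next best action is approximated as picking the offer with the highest expected value (propensity times value), and the contact policy as a simple frequency cap. Both rules, the offer names and the numbers are hypothetical. Real decision engines are considerably richer.

```python
from datetime import datetime, timedelta

def can_contact(contact_history, now, max_contacts=3, window=timedelta(days=7)):
    """Frequency cap: allow a contact only if fewer than max_contacts fall in the window."""
    recent = [t for t in contact_history if now - t <= window]
    return len(recent) < max_contacts

def next_best_action(offers):
    """offers maps offer name -> (predicted propensity to accept, value if accepted)."""
    return max(offers, key=lambda name: offers[name][0] * offers[name][1])

now = datetime(2024, 3, 11, 12, 0)
history = [now - timedelta(days=1), now - timedelta(days=3), now - timedelta(days=10)]
offers = {"free_shipping": (0.30, 5.0), "10_percent_off": (0.20, 12.0)}

if can_contact(history, now):
    print("Send:", next_best_action(offers))
else:
    print("Contact cap reached, skip this trigger")
```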
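To show how these factors interact, here is a deliberately simplified lifetime value formula: margin per order, times orders per year, times an expected lifetime approximated as one over the annual churn rate. This is an assumption for illustration, not the model from the earlier article, and it ignores discounting and cohort effects.

```python
def simple_cltv(avg_order_value, margin_rate, orders_per_year, annual_churn_rate):
    """Simplified CLTV: yearly margin multiplied by expected lifetime in years."""
    if not 0 < annual_churn_rate <= 1:
        raise ValueError("annual_churn_rate must be in (0, 1]")
    expected_lifetime_years = 1 / annual_churn_rate
    return avg_order_value * margin_rate * orders_per_year * expected_lifetime_years

# Hypothetical baseline, then the effect of stretching order frequency or reducing churn
baseline = simple_cltv(50.0, 0.30, orders_per_year=4, annual_churn_rate=0.25)
more_orders = simple_cltv(50.0, 0.30, orders_per_year=5, annual_churn_rate=0.25)
less_churn = simple_cltv(50.0, 0.30, orders_per_year=4, annual_churn_rate=0.20)

print("Baseline:", baseline)
print("One extra order per year:", more_orders)
print("Churn reduced to 20%:", less_churn)
```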

AB Testing

AB tests, and more generally online experiments, consist in providing different variants to different groups of customers or visitors. Their aim is to understand the impact of rolling out a feature or offer to a set of customers.

A/B tests act as a measure of control when rolling out new features, making sure that these features do not contribute negatively to the experience or to the business. They also act as a performance measurement process to measure the uplift generated by different initiatives.

An experiment is normally composed of three phases, experiment definition, creation and evaluation:

Experiment definition: The experiment definition is the first phase of the experiment process. It is the phase where Product Managers provide requirements and designers provide designs for what the experiment will look like. For data professionals, it is the time to answer a few questions:

  • Which segments of the population should be exposed to the experiment, and under which conditions should it fire
  • How should the groups be allocated, considering other running experiments
  • What metrics should we look at, and what are the expected uplift and the conditions of success
  • For how long should the experiment run, and what sample size are we meant to reach
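Two of these questions lend themselves to a short sketch: sample size and group allocation. The sample size uses the usual normal approximation for comparing two conversion rates. The allocation hashes the user id together with an experiment id so that assignment is stable per user and differs across experiments, which is one common approach rather than the only one. Names and traffic figures are hypothetical.

```python
import hashlib
import math
from statistics import NormalDist

def sample_size_per_group(baseline_rate, relative_uplift, alpha=0.05, power=0.8):
    """Visitors needed per variant to detect the uplift with a two-sided test."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_uplift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

def assign_variant(user_id, experiment_id, variants=("control", "treatment")):
    """Deterministic allocation: same user and experiment always give the same variant."""
    digest = hashlib.sha256(f"{experiment_id}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

n = sample_size_per_group(baseline_rate=0.05, relative_uplift=0.10)
daily_visitors = 5_000  # hypothetical eligible traffic per day, split over two variants
print("Sample size per group:", n)
print("Days to run:", math.ceil(2 * n / daily_visitors))
print("Variant for user 42:", assign_variant("42", "checkout_button_test"))
```

Small expected uplifts on low baseline rates quickly lead to large sample sizes, which is why the expected uplift should be agreed on before the experiment starts.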

Experiment creation: The experiment creation phase is where developers spend their time coding and building the actual experiment. They typically build it based on the requirements provided by the PMs, the designs provided by the designers and the data requirements provided by analytics.

The role of the data professional in this phase is to partner with engineers to ensure that logging is in place and of good quality, as well as to create the pipelines that integrate events and logs into an experiment analytics framework.

Experiment evaluation and analysis: The experiment interpretation is the phase where analysts shine. They interpret the experiment results and deep dive into them to get a better understanding of customer behavior and a sense of what drives customers to certain actions. They help the business understand whether a certain experiment should be rolled out or rolled back, and extract any valuable learnings from what has been attempted.

There is a wide range of software for running experiments on e-commerce websites, from Google Analytics providing easy access to data, to experiment setup software such as Google Optimize, Optimizely and Qubit. More tech driven companies also tend to have their own experimentation stack, such as Uber with its experimentation platform Uber XP [2].
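For a conversion rate metric, the first step of that interpretation is often a two-proportion z-test on the control and treatment groups. The sketch below is a minimal version with hypothetical counts. It does not cover corrections for multiple metrics or repeated peeking at results.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (relative uplift of B over A, z statistic, two-sided p-value)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return (p_b - p_a) / p_a, z, p_value

# Hypothetical results: conversions and visitors for control (A) and treatment (B)
uplift, z, p_value = two_proportion_z_test(500, 10_000, 600, 10_000)
print(f"Uplift: {uplift:.1%}, z = {z:.2f}, p-value = {p_value:.4f}")
```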


Personalization

I previously wrote about the principles of e-commerce personalization, and how personalization can happen with different levels of customer identity, from anonymous personalization to 360 degree view personalization. Personalization has been shown to have a variety of benefits, such as increased engagement, conversion, loyalty and much more. Different analytics techniques exist to drive personalization, such as segmentation, clustering, collaborative filtering, propensity scoring models, association rules, …
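As a small taste of these techniques, the sketch below counts how often items are bought together and recommends the most frequent companions of a given item. It is a simplified cousin of association rules and item-based collaborative filtering, with hypothetical baskets. It uses raw co-occurrence counts rather than support, confidence or similarity measures.

```python
from collections import Counter
from itertools import combinations

def co_occurrence(baskets):
    """Count, for each ordered pair of items, the baskets containing both."""
    pair_counts = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
    return pair_counts

def recommend(item, pair_counts, top_n=3):
    """Items most often bought with `item`, ties broken alphabetically."""
    candidates = [(other, c) for (i, other), c in pair_counts.items() if i == item]
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    return [other for other, _ in candidates[:top_n]]

baskets = [
    ["shoes", "socks"],
    ["shoes", "socks", "laces"],
    ["shoes", "laces"],
    ["socks", "hat"],
    ["shoes", "socks"],
]
counts = co_occurrence(baskets)
print(recommend("shoes", counts, top_n=2))
```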

Setting up an effective personalization data strategy is based on three different pillars:

  • Collection: What data can reliably be collected, and at what degree of timeliness, quality and completeness
  • Modeling: What type of data modeling is required in order to successfully power the level of personalization required
  • Activation: How is the personalization meant to be exploited, for example through offer targeting options, …

Personalization in itself can be done with little data, but it benefits from leveraging as much data as possible. The difficulty in leveraging that data is making sure that the right infrastructure is in place to collect and act on the data in a unified and systematic way.

A multitude of technologies exist to power this effort, both onsite and offsite: Customer Data Platforms such as Apache Unomi or mParticle to help obtain 360 customer profiles, Data Management Platforms such as Krux or Adform to enrich anonymous profiles with 3rd party data, and personalization components of CMSs such as Bloomreach or Wagtail.

Wrapping up

Setting up a data strategy for e-commerce requires quite a distinct set of data, domain knowledge and systems to power that effort. It should be seen as an iterative process, looking at what can add the most value at each stage. A data strategy needs to consider the full value chain of data, from collection to usage, understand how a particular piece of data can be used to deliver value, and ensure that the specific resources (people, software, business) needed to turn data into value are in place.