Optimizing Cart Upsell Blocks for Increased Average Order Value and Revenue

Don’t jump straight to the solution; it may not work for you. But the process behind it does, with a win rate of 40–55%.

Kyrylo Horban
7 min read · Dec 2, 2023

Business: Plants and Seeds Retailer

Goal: Increase AOV from cart by 22%

Problem: Identify the blocks that will most effectively maximize the likelihood of hitting the target (22%).

My role: Conversion Rate Optimization Expert


Challenge 1: Inadequate Cart Upsell Strategy

Research setup: My initial analysis revealed that the existing cart upsell strategy was underperforming, failing to maximize AOV.

Resolution: I revamped the strategy by focusing on data-driven recommendations and aligning it with user preferences.

Challenge 2: A/B Test Complexity

A/B Test Preparation Challenge:
Preparing for the A/B test was intricate, involving changes to cart upsell blocks that required precise execution.

Resolution: To mitigate complexity, I phased the A/B test implementation, allowing us to monitor and adapt to changes more effectively.

Challenge 3: Timeframe

The business has 2 explicit seasonal peaks, but between those peaks it’s hard to acquire enough targeted traffic to run an A/B test.
We decided to run the test BEFORE the season to mitigate the risk of skewed data.

Industry practices

I studied 50+ industry leaders and discovered promising solutions related to cart upsell blocks. Some observations:

  • Personalization: Leading e-commerce businesses emphasized personalized product recommendations within the cart.
  • Bundle Offers: Successful stores offered bundle discounts and incentives in the cart, probably driving higher AOV.


Hypothesis Generation. Laser focus here.

A hypothesis should be taken VERY seriously. It’s a goldmine if done right.
It can make or break the whole process and the final outcome, assuming the research has been done properly.

Before we jump into the hypothesis itself, let’s break it down into steps using a basic example.

My hypothesis ranking
Note: this is a hypothesis test of whether there is a difference between two or more groups.

0 — Poor hypothesis, 5 — Solid hypothesis

0/5: Men earn more than women.

1/5: Men earn more than women in the same job.

2/5: Group of men earn more than a group of women in the same job in Austria.

3/5: Men earn more than women who are visual designers in Austria.

4/5: Men earn more than women who are visual designers having a Bachelor degree in Arts in Austria.

4.5/5: We observe that over the last 2 years men earn more than women in their average monthly net income who are visual designers having a Bachelor degree in Arts in Austria.

Now, break it down
There is an [observation] within a [specific timeframe], where [group 1] differs from [group 2] by [difference] in terms of [metric], for the segment with [group attribute 1] and [group attribute 2] in [Geo segment]

This simple template helps me formulate solid hypotheses before testing them.
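The template can also be filled in programmatically. A minimal sketch, using illustrative values from the salary example above (the field names are my own, not a standard):

```python
# Hypothetical illustration: filling the hypothesis template with the salary example
template = (
    "There is an {observation} within {timeframe}, where {group1} differs "
    "from {group2} in terms of {metric}, for the segment with {attr1} and "
    "{attr2} in {geo}."
)

hypothesis = template.format(
    observation="observed difference",
    timeframe="the last 2 years",
    group1="men",
    group2="women",
    metric="average monthly net income",
    attr1="job title: visual designer",
    attr2="education: Bachelor of Arts",
    geo="Austria",
)
```

Writing the template down as code forces every bracketed slot to be filled before the hypothesis is considered complete.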

Now let’s apply the same method to the cart upsell case, based on the research.

Leveraging the research findings, direct user feedback, and industry practices, I formulated the following hypotheses:

Hypothesis 1
Customers who add only 1 item to their cart are 3.6 times less likely to convert than those who add at least 2 items from different categories, likely because there is a minimum order value required to proceed to checkout.
If we prompt customers who have only 1 item in their cart to consider adding more, it may increase AOV.
By exposing one gift(s)-with-purchase block in the cart to customers who reach [above_the_median_AOV] and another at [75_percentile_AOV], we are likely to hit the target.

Null hypothesis: There is no difference in AOV between customers exposed to the gift(s)-with-purchase block and those who are not, regardless of how many items they add to the cart.
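As a sketch of how the two GWP thresholds could be derived and applied, here is a stdlib-only example. The order values and tier names are hypothetical stand-ins, not the retailer’s real data:

```python
from statistics import median, quantiles

# Hypothetical historical order values; real thresholds come from analytics
order_values = [35, 42, 58, 63, 71, 88, 94, 107, 120, 150]

median_aov = median(order_values)             # 50th percentile of AOV
p75_aov = quantiles(order_values, n=4)[-1]    # 75th percentile of AOV

def gwp_tier(cart_total: float) -> str:
    """Pick which gift-with-purchase block (if any) to show for a cart total."""
    if cart_total >= p75_aov:
        return "premium_gift"    # second GWP block at the 75th percentile
    if cart_total >= median_aov:
        return "standard_gift"   # first GWP block above the median
    return "no_gift"
```

Anchoring the gift thresholds to the AOV distribution (rather than fixed dollar amounts) keeps the incentive just above what a typical customer already spends.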

Hypothesis 2
Internal data shows customers often fail to find desired strawberry products due to search limitations around synonyms and typos. If we tune the search engine to handle these cases, users will be more likely to find relevant items and add them to their cart.

There is no null hypothesis:
it’s unrealistic to A/B test search with improved synonym and typo handling.

Hypothesis 3
Allowing users to save their cart by adding an email will reduce cart abandonment rates. We can then use email sequences with relevant recommendations to reach out to users with saved carts. This can increase average order value.

Null hypothesis: There is no difference in conversion between users exposed to the save-cart-with-email block and those who are not.

Hypothesis Prioritization

Prioritizing Hypotheses: Recognizing the business’s objectives, I prioritized Hypothesis 1. Its high potential impact, moderate implementation effort, and strong alignment with the AOV goal made it the logical choice.

If this doesn’t work, we’ll A/B test Hypothesis 3.

Let’s see…

A/B Test Implementation. Not an easy one.

Creating Variations:
We executed Hypothesis 1 by introducing a gift-with-purchase (GWP) block.

Setup: We decided to run a server-side experiment with Google Analytics integration.
A/B test split 50%/50% across all traffic, as the business acquires diverse traffic sources.

Cart: Original (Control / Baseline)

Cart: Variation 1 (Treatment / Experiment / Alternative)
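A common way to implement a server-side 50/50 split is deterministic hashing of a stable user ID, so the same visitor always lands in the same group. A minimal sketch (the experiment name is hypothetical):

```python
import hashlib

def assign_variant(user_id: str, experiment: str = "cart_gwp_test") -> str:
    """Deterministic 50/50 split: the same user always gets the same group."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # uniform bucket in 0..99
    return "treatment" if bucket < 50 else "control"
```

Hashing on the user ID (rather than the session) keeps returning visitors in the same group across visits; salting with the experiment name prevents the same users from always landing together across different experiments.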

A/B test implementation and monitoring

We had to stop the experiment 3 times in a row.

A/B test groups must be similar in how long users have been customers on average

Historically, most visits come from users who first visited within the past 3 months. But the treatment group somehow had more users who joined over 2 years ago.

So there is an apples-to-oranges problem, i.e. relatively new vs. loyal users.
Baseline group: 80% of users visiting the old cart were in their first 3 months as new members.
Experiment group: 60% of users visiting the new cart had been active for over 2 years.

The problem
Looking at engagement metrics, long-time established users behaved differently from recently joined newcomers, regardless of the cart design change. This can skew A/B test results.
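This kind of imbalance can be caught early with a chi-square test of homogeneity on a covariate such as customer tenure. A stdlib-only sketch, with illustrative counts (9.21 is the chi-square critical value at alpha = 0.01 for 2 degrees of freedom, i.e. 3 tenure buckets):

```python
def tenure_imbalance(control_counts, treatment_counts, critical=9.21):
    """Chi-square test of homogeneity across tenure buckets.

    Returns (chi2_statistic, imbalanced_flag). critical=9.21 is the
    chi-square critical value at alpha=0.01 with 2 degrees of freedom.
    """
    total_c, total_t = sum(control_counts), sum(treatment_counts)
    grand = total_c + total_t
    chi2 = 0.0
    for c, t in zip(control_counts, treatment_counts):
        col = c + t
        exp_c = col * total_c / grand  # expected count if groups are alike
        exp_t = col * total_t / grand
        chi2 += (c - exp_c) ** 2 / exp_c + (t - exp_t) ** 2 / exp_t
    return chi2, chi2 > critical

# Hypothetical users per tenure bucket: (<3 months, 3 months-2 years, >2 years)
chi2_stat, imbalanced = tenure_imbalance([8000, 1500, 500], [4000, 2000, 4000])
# imbalanced is True here: pause and fix randomization before continuing
```

Running this check on tenure, device, and traffic source during the first days of an experiment surfaces randomization problems before they contaminate weeks of data.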

Solution: Pause experiment

On the second run, within a week we saw a mobile traffic skew deviating 2x from the average.
It turned out to be incorrect site tagging:
the tag manager failed to fire properly on mobile web only,
which selectively diverted mobile users instead of randomizing them correctly.

Solution: Fixed by QA’ing all tags across device experiences.

The third time it was a… bug unintentionally introduced by a dev inventory update. When users reached the required sum to checkout, the GWP didn’t work, so the variation performed worse.

Solution: Fixed bug immediately.

It’s always, always important to monitor metrics to be sure the results are what you expect. Unfortunately, I’ve witnessed the opposite a lot, even when working for big brands that didn’t care about this.

Data Collection and Analysis

Key Metrics Tracked:

Statistical Significance:

  • Primary Metric 1 (AOV):
    Our analysis revealed a statistically significant increase in Average Order Value (AOV) in the experimental group compared to the control group, with a p-value < 0.10
  • Primary Metric 2 (Revenue):
    The test also demonstrated a significant improvement in overall revenue, with a p-value < 0.10

Note: I set the significance level (alpha) to 0.10. We are not a pharma company, and for a business like this it’s acceptable to tolerate a 10% chance that the observed change is due to random chance.

  • Secondary Metric (Conversion Rate):
    The conversion rate remained stable with no statistically significant difference between the experimental and control groups.
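The primary-metric significance check at alpha = 0.10 can be sketched with a normal-approximation two-sample test on order values, which is adequate for the large samples typical of e-commerce. The order values below are illustrative, not the experiment’s data:

```python
import math
from statistics import NormalDist, mean, stdev

ALPHA = 0.10  # significance level used in this test

def p_value_two_sample(a, b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    se = math.sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Illustrative per-order values for control vs treatment
control = [88, 95, 102, 99, 91, 105, 97, 93, 101, 96]
treatment = [104, 112, 118, 109, 115, 121, 107, 113, 119, 110]

significant = p_value_two_sample(control, treatment) < ALPHA
```

For real traffic volumes a t-distribution correction barely moves the p-value, which is why the normal approximation is a common shortcut here.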

Practical Significance:

A 16.30% relative increase in AOV emphasizes the practical impact, leading to substantial revenue growth.

Statistical Power:

Our test exhibited a robust statistical power of 0.90 (90%), showcasing the high likelihood of detecting the observed effect.
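For reference, the power side of the setup can be sketched with the standard normal-approximation sample-size formula. The effect size below (Cohen’s d = 0.2) is an assumption for illustration, not the test’s measured effect:

```python
import math
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.10, power=0.90):
    """Per-group n for a two-sample test (normal approximation, two-sided)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # e.g. 1.645 at alpha=0.10
    z_power = NormalDist().inv_cdf(power)          # e.g. 1.282 at power=0.90
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# e.g. to detect a small effect (d = 0.2) at alpha=0.10 and power=0.90:
n_per_group = sample_size_per_group(0.2)
```

This is why the seasonal-traffic constraint mattered: smaller expected effects push the required sample per group into the hundreds or thousands, which low-season traffic cannot supply.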


Incremental Sales from Experiment: 60 orders
(conversion also increased slightly, though not statistically significantly)
Average order value: $107
Incremental revenue: 60 * $107 = $6,420
Cost to redevelop and maintain GWP: $650
Incremental profit: $6,420 - $650 = $5,770
ROI = $5,770 / $650 ≈ 8.9, i.e. roughly an 888% return on the implementation cost
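The arithmetic above, as a quick sanity check (numbers taken from the case):

```python
incremental_orders = 60
aov = 107.0   # average order value, USD
cost = 650.0  # cost to redevelop and maintain the GWP block

incremental_revenue = incremental_orders * aov    # 6420.0
incremental_profit = incremental_revenue - cost   # 5770.0
roi = incremental_profit / cost                   # ~8.88, i.e. ~888%
```

Even with the target missed, the ratio of incremental profit to implementation cost makes this an easy investment to defend.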

Overall Observation
The changes had a positive impact on both primary and secondary metrics. We didn’t meet the 22% target, but achieved 16.3%.

This allows us to extend the hypothesis through further iterations.

Next Steps

  1. Iterate on hypothesis 1 playing around with different GWP offers and pricing
  2. Enhance search engine capabilities to tolerate typos and synonyms
  3. Test Hypothesis 3
  4. Update new checkout experience

I’m sure your business deserves better AOV, conversion, and revenue, as well as a maximized ROI.

Kyrylo, compozio.com



Kyrylo Horban

Sr UX designer for Ecom and SaaS. Focus on Conversion Rate Optimization. compozio.com