Behind the Design: Optimizely’s Personalization Results

Published in Design @ Optimizely · Sep 21, 2016 · 9 min read

In October of 2015, we launched Optimizely Personalization. This new product makes it easy for our customers to serve customized content to specific groups of visitors. By doing so, our customers can provide a superior experience to their users, gaining a competitive advantage that increases conversions and revenue.

A key component of the product is measuring and showing the impact from personalizing your website. This presented a unique design challenge since we needed to communicate information that customers weren’t accustomed to seeing, in either our product or other analytics products. As such, this project went through many iterations and rounds of user feedback, which I describe in this post.

Background

Before we dive into designs, some background information about how personalization works is helpful to contextualize the work. Personalization is centered around “Campaigns,” which are a set of audiences that will each get a customized experience when visiting specific pages of your site. For example, you can show shoes to people who have browsed shoe products recently, hats to people who have browsed hats recently, and so on. Read more about how to set up personalization campaigns on Optimizely’s blog.

A key part of running a campaign is knowing the impact of that campaign, so that you know if it’s increasing conversions. Based on our early research, we knew we needed to show people the impact of the campaign as a whole (i.e. across all audiences), and the impact of personalizing to each audience.

Initial Version

Based on our initial generative research about personalization, we knew customers cared about the campaign’s overall improvement to their main conversion metric (typically revenue). This helps people show their boss the impact their work is having, and it tells them whether personalization is helping at all. People also commonly have supporting metrics they want to track, like “Add to Cart” events, which we needed to display. We also learned it was important to show audience reach, as audiences with smaller reach were likely to have a lower impact.

Starting with just these basic insights we designed an initial version of the page.

*Initial version of the Personalization Results page*

Phase 1: Incremental Improvements

Since our approach to personalization was a fairly new concept for most people, there were a number of problems with our first version of this page:

We were introducing a lot of new numbers and metrics without any clear guidance around what they meant or why they were important.
There was no clear design hierarchy. Most customers were doing this kind of personalization for the first time, so they didn’t necessarily know how to interpret the data. And since we were also doing personalization for the first time, we weren’t sure how best to display results and what performance indicators were most important.
The language wasn’t very comprehensible. There was a ton of new terminology and really big pop-tips with long, detailed explanations of these new concepts.

*A smattering of numbers we showed customers*

So for phase 1, we focused on incremental improvements to fix the most egregious issues on the page, such as exposing some raw numbers and fixing the copy to make it clearer.

One specific example is that we had a “Total Increase” metric to show the raw increase in conversions (as opposed to a percentage). But we didn’t really define it anywhere in the UI, and the popover copy was hard to understand. So we fixed the copy and made the explanation in the popover clearer and more concise. We also exposed more of the raw numbers behind the aggregated metrics, since customers wanted to see those to help them understand how we were calculating the results. It also increased their trust in our reporting.

*Incremental improvements to the results page*

Phase 2: No Sacred Cows

Phase 2 of our redesign was the big one: we were willing to scrap anything and take big leaps to make the page truly great. Now that customers were having an easier time interpreting the data, we could dig deeper and get more valuable feedback to improve the customer experience.

We started with another round of research, in which we learned users were still having trouble interpreting the new concepts and taking any kind of action on this data. They also didn’t quite know what number to focus on. The total improvement and increase in conversion numbers had equal prominence and fought for attention. This was especially true in cases where one number was statistically significant and the other wasn’t, which made it difficult to know how to gauge the success of their campaign. We wanted the results page to tell a story — to be descriptive enough for a user who is new to personalization, but at the same time provide enough value for deeper analysis so a customer could take action on their data.

Based on this research, we focused separately on the campaign summary and the audience detail pages. The campaign summary gave a high level overview of how the campaign was doing for the selected metric, and the audience detail page provided detailed stats on how each audience was performing for every event.

For the campaign summary, we broke up the page into sections. Each section would tell a portion of the story, with a title and subtitle to describe what the particular section was trying to communicate.

Section 1: Overall Improvement

This section of the results page communicates the overall performance of this personalization effort.

We got rid of total increase and experimented between “Net Impact” and “Overall Improvement” as the primary number on the page.

After a few user testing sessions, “Overall Improvement” was the clear winner. It matched people’s mental models, was similar to A/B test results they were used to, and overall was more understandable than “Net Impact.” We also added a confidence interval, a sentence explaining how this number was computed, and an audience comparison graph to show how improvement levels had changed per audience over a period of time. Several design iterations resulted in this being the most popular view.

Section 2: Campaign Reach

This second card was meant to show how traffic was flowing and how sessions were being distributed. This would help customers know who was being bucketed into the personalized group versus holdback and why.

Personalization introduced a ton of new terms and concepts, and users had to read through copy to see how they all connected. We wanted to depict this data visually, in a way that showed how these numbers were connected and how visits in a campaign were distributed. We explored a bunch of ways of visualizing this data: tree diagrams, donuts, pie charts, and more.

*Campaign reach with modified pie charts*

*Campaign reach with funnel visualization*

We showed these to users, and the funnel visualization was the easiest to understand. Users quickly understood how much traffic their site had and how much of it was on pages where the campaign was running. It also showed how much of the page traffic was personalized, and finally the conversion rates for personalized and non-personalized visitors.

Section 3: Audience Breakdown

The third section was the audience breakdown, which communicated how audiences were performing relative to each other. This could shed light on whether audiences were too small, or whether underperforming audiences needed to be shown a better experience. The main question we were trying to answer here was: What was the overall impact that each audience had on this campaign?

Initially this was a simple bar chart where the height of the bar depicted the reach of an audience.

We learned from research that this alone served no purpose and users wanted to see reach relative to performance, so as to know how effective each audience personalization was. Once again, we explored many visualizations to discover the clearest one.

*Audience breakdown with a modified bar chart and improvement on the right*

*Audience breakdown with a bubble chart*

*Audience breakdown with variable-width bar chart*

After showing these to some users, the clearest one was a variable-width bar chart that used both axes to show audience reach (x-axis) relative to change in conversion rate, or lift (y-axis). Together, the widths of the bars made up 100% of campaign traffic, i.e. all campaign audiences plus everyone else not being served a personalized experience. The area of each bar visualized how much an audience contributed given its reach. So an audience with 80% reach and 10% lift was just as impactful as an audience with 10% reach and an 80% lift.
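To make the area arithmetic concrete, here is a minimal sketch in Python. It is not the code behind our chart: the `AudienceBar` class, its field names, and the `layout` helper are hypothetical, and treating reach multiplied by lift as a stand-in for impact is a simplifying assumption taken from the example above.

```python
from dataclasses import dataclass


@dataclass
class AudienceBar:
    """One bar in a variable-width bar chart (hypothetical model)."""
    name: str
    reach: float  # share of campaign traffic, 0.0 to 1.0 (bar width)
    lift: float   # relative change in conversion rate (bar height)

    @property
    def area(self) -> float:
        # Assumption: bar area (width x height) approximates impact.
        return self.reach * self.lift


def layout(bars):
    """Place bars left to right; return (name, x_start, x_end, height)."""
    x = 0.0
    rects = []
    for bar in bars:
        rects.append((bar.name, x, x + bar.reach, bar.lift))
        x += bar.reach
    return rects


if __name__ == "__main__":
    bars = [
        AudienceBar("broad audience", reach=0.80, lift=0.10),
        AudienceBar("niche audience", reach=0.10, lift=0.80),
        AudienceBar("everyone else", reach=0.10, lift=0.0),
    ]
    for bar in bars:
        print(f"{bar.name}: area = {bar.area:.2f}")
    for rect in layout(bars):
        print(rect)
```

The two audiences end up with the same area even though their bars look very different, which is the comparison the chart was meant to make easy. The "everyone else" bar has width but no height, so the widths still add up to all campaign traffic.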

Section 4: Improvement Table

The last section is a breakdown per metric, aggregated across all audiences. This looked at the campaign through the lens of the metric instead of the audience. The first version of this section had a table with improvement numbers for all events.

The feedback we got from users was that this information was too exhaustive and hard to digest. We ended up replacing this with several cards that showed how all metrics were performing for all audiences. This showed how a given metric was performing overall, with improvement being the main driver of performance.

*Individual cards for improvement by event*

Final Campaign Summary Page

After iterating on each of these sections the final Campaign Summary page looked like this:

Audience Detail Pages

Since the breakdown of how a specific audience is performing in a personalization campaign is the same as running an A/B test targeted to that audience, customers were expecting to see certain data they’re used to seeing in A/B test results, such as improvement and statistical significance. They also missed the visual detail that existed in the A/B results page but wasn’t adopted in the personalization results page.

Luckily, fixing this was simple: we just needed to update the page to have the same data as our A/B test results page. We added the graphs from that page, improved table layout, and did some visual polish. We also added a new descriptive dropdown, icons, and colors that communicated whether the personalized variation was statistically significant or not. This helped customers interpret the results in a single glance, reducing cognitive load.

After several rounds of research and many iterations, customers can confidently interpret the impact of their campaigns. We’ve heard from several customers that they find the Net Impact numbers in the “Audience Breakdown” table valuable, since these help them know who their most valuable audiences are. This is especially true when these are monetary numbers, because they translate directly to how much extra money each audience is bringing in. As our personalization customer base grows, we hope to keep learning through customer feedback so we can make the data more actionable and valuable to our customers.