3 steps to a data story in e-commerce

Reach.ly
Apr 1, 2015

The great thing about e-commerce is the “e” part of it. It means that anything can be observed, quantified and analysed. In recent years the availability of data has grown tremendously. With Google Universal Analytics you can now add custom data dimensions of your own. What has not grown as fast is our ability to use this data. Many people still look for that one simple average value, which is often misleading. My aim with this post is to give you three simple steps to read your data and tell a simple but data-driven story. As an example I will use sales cycle length: how long it takes someone to complete a purchase, measured in time. It can also be measured in page views, visits or product views, and sliced by traffic source, but that is a topic for another post.

1. Map out your data

As with every journey, you want to see the bigger picture and understand scale and size. Datasets are no different. You have to map them out. For this article I have taken 1711 actual purchases made during a one-month period at one of our clients’ online stores. Every sales cycle is measured in seconds. For convenience I have converted that to hours and minutes. The longest sales cycle is 576 hours, or 24 days. Well, for some people it takes time to figure out what they want. Let’s see how our graph looks if we plot how many people finished their purchase in each hour.

What can we see here? First of all there is the range: from a couple of minutes to more than three weeks in some extreme cases. The majority of purchases happen in the first hours but, interestingly enough, there are bumps every 24 hours. They suggest that people who don’t buy right away often come back almost exactly 24 hours later. Smaller bumps can be seen for the following days as well.
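The hourly counts behind a plot like this can be sketched in a few lines of Python. This is only an illustration: the sample values and the function name `purchases_per_hour` are made up for the example and are not the client's data or the tooling used for this post.

```python
from collections import Counter

def purchases_per_hour(cycle_seconds):
    """Count how many purchases finished within each whole hour of the sales cycle."""
    # Integer division by 3600 maps 0-3599 s to hour 0, 3600-7199 s to hour 1, and so on.
    return Counter(seconds // 3600 for seconds in cycle_seconds)

# Hypothetical sales cycle lengths in seconds (illustrative only).
sample_cycles = [480, 500, 960, 4620, 86700, 87000, 2073600]

histogram = purchases_per_hour(sample_cycles)
for hour in sorted(histogram):
    print(f"hour {hour}: {histogram[hour]} purchase(s)")
```

Plotting these counts against the hour gives the kind of graph discussed above, including any bump around hour 24.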

2. Know your averages

In everyday life we are used to talking about an average value, but in reality there are several numbers which each represent an average in their own way. The first one we will look at is the one used most in everyday life: the mean.

The mean is the average of the numbers: a calculated “central” value of a set of numbers. To calculate: Just add up all the numbers, then divide by how many numbers there are.

By the way, Google Analytics uses the mean to calculate average session duration and other metrics.

So where would this be on our graph?

Here you go: the mean is that red line, at 57 hours or 2.4 days. Looking at the graph, you might notice that the mean sits in a rather deserted place. Not many people actually complete their purchase around that point. By saying “On average it takes 2.4 days to finalise a purchase in our store” you are talking about this particular point, and it describes only a small portion of the real customer base.
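To see why the mean can land where few customers actually are, here is a minimal sketch with a handful of made-up sales cycles. The numbers are hypothetical and chosen only to mimic the shape of the data: many quick purchases and a few very slow ones.

```python
from statistics import mean

# Hypothetical sales cycles in hours (illustrative only, not the client's data).
cycle_hours = [0.1, 0.2, 0.3, 1.0, 2.0, 24.0, 200.0]

# The mean: add up all the values, then divide by how many there are.
manual_mean = sum(cycle_hours) / len(cycle_hours)

# The standard library gives the same number (up to floating-point rounding).
library_mean = mean(cycle_hours)

# Count how many customers finished sooner than the mean.
faster_than_mean = sum(1 for hours in cycle_hours if hours < manual_mean)

print(f"mean: {manual_mean:.1f} hours")
print(f"{faster_than_mean} of {len(cycle_hours)} customers were faster than the mean")
```

In this toy sample a single 200-hour purchase drags the mean far to the right of where most customers are.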

There is another average value which is underused. It is called the median and it marks the middle of all the values.

In statistics and probability theory, the median is the number separating the higher half of a data sample, a population, or a probability distribution, from the lower half.

In our case it is also the point by which 50% of all purchasers have completed their purchase.

That is the yellow line at the far left of the graph, at precisely 1 hour and 17 minutes. That is quite a big gap from our previous average.
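The median can be sketched the same way. The helper name `middle_value` and the sample values below are assumptions made for this example; Python's built-in `statistics.median` does the same job.

```python
from statistics import median

def middle_value(values):
    """Return the median: the middle of the sorted values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: there is exactly one middle value.
        return ordered[mid]
    # Even count: take the midpoint of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

# Hypothetical sales cycles in hours (illustrative only, not the client's data).
cycle_hours = [0.1, 0.2, 0.3, 1.0, 2.0, 24.0, 200.0]

print(f"median: {middle_value(cycle_hours)} hours")
print(f"library median: {median(cycle_hours)} hours")
```

Unlike the mean, the median of this toy sample does not move if the slowest purchase takes 200 hours or 2000 hours.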

Most of the time people look for one number that describes what most customers do. The number for that is the mode.

The mode is the value that appears most often in a set of data.

For this we will zoom in on the first three hours of the sales cycle.

In our case the mode is 8 minutes. That is the peak of our dataset. In other datasets there could be several peaks, and the mode would be a misleading measure.
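As a rough sketch, the mode can be found by counting how often each value occurs. The samples below are hypothetical, rounded to whole minutes, and the second one is included only to show what several peaks look like. `statistics.multimode` requires Python 3.8 or newer.

```python
from collections import Counter
from statistics import multimode

# Hypothetical sales cycles in whole minutes (illustrative only).
cycle_minutes = [5, 8, 8, 8, 12, 16, 16, 77, 1440]

# Count occurrences of each value and take the most frequent one.
counts = Counter(cycle_minutes)
peak_minute, peak_count = counts.most_common(1)[0]
print(f"mode: {peak_minute} minutes ({peak_count} purchases)")

# A sample with two equally high peaks: no single mode describes it well.
two_peaks = [8, 8, 16, 16, 30]
print(f"modes of the two-peak sample: {multimode(two_peaks)}")
```

Checking for more than one mode before quoting it is a cheap way to avoid the misleading case mentioned above.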

3. Read distribution

We have played around with various averages, but there is still more we can extract from the distribution. As we have found the 50% mark, let’s look at the 25% and 75% marks: how long it takes for 25% and 75% of all customers, respectively, to finish their purchase.

25% of all purchases are done within around 16 minutes. That is a whole hour sooner than the next 25% of customers, who take us to the median at 1 hour and 17 minutes. And finally, 75% are through within 86 hours, or 3 days and 14 hours. This is even beyond our mean value.

These marks are called quartiles and they give a good indication of the shape of the distribution.
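A minimal sketch of reading quartiles with Python's standard library follows. The sample is hypothetical, and the choice of `method="inclusive"` is an assumption made for this example; other interpolation methods give slightly different cut points on small samples.

```python
from statistics import median, quantiles

# Hypothetical sales cycles in minutes (illustrative only, not the client's data).
cycle_minutes = [5, 8, 10, 16, 77, 120, 600, 5160, 9000]

# n=4 splits the data into four equal parts and returns the three cut points.
q1, q2, q3 = quantiles(cycle_minutes, n=4, method="inclusive")

print(f"25% of purchases are done within {q1:.0f} minutes")
print(f"50% of purchases are done within {q2:.0f} minutes")
print(f"75% of purchases are done within {q3:.0f} minutes")

# The second quartile is the median.
print(f"second quartile equals the median: {q2 == median(cycle_minutes)}")
```

A large gap between the second and third cut points, as in this toy sample, is a sign of a long tail of slow purchases.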

Wrap up

So what is our story? 50% of our customers finalise their purchase within about 1 hour and 20 minutes, and 25% do so in less than 20 minutes. One quarter of customers have a sales cycle longer than about three and a half days, and the highest concentration of customers is around 8 minutes.

Time is just one dimension, and it would be worthwhile to look at this data in a broader context: do these people come from a particular geography or a specific referral, what kind of products are they buying, and so on. It might be that cheaper and smaller items are sold in the first minutes and more expensive goods take more time. Based on such insight, one could run more targeted campaigns.

In three simple steps we have arrived at a bigger picture of how people actually buy products in your online store, and we have raised several questions for further research.

Originally published at reach.ly on March 30, 2015.
