Identify Customer Lapsation in the Retail Sector: A Data-Driven Approach

Shreya Sagar
Capillary Data Science
4 min read · Oct 26, 2021

In the retail industry, several kinds of KPIs help measure a brand's performance. They can be sales-based (lifetime sales of a user, Average Transaction Value (ATV), sales per visit), product-based (most purchased category, most purchased product), or customer-behavior-based (total visits, total quantity purchased, average latency, recency). In this article, I explain in simple terms one of the most important of these tools, the “Lapsation Curve”, which a brand can use when planning an engagement strategy for its customer base.

Let’s start with a basic understanding of the Lapsation Curve:

What is the Lapsation Curve?

Every brand defines an active period. A customer who does not shop with the brand within this period is tagged as a lapsed customer. To identify this period, we use the Lapsation Curve. Put simply, the Lapsation Curve is a mathematical representation of “within how many days do 80% of our customers make a repeat visit with us”.

Most brands use the Lapsation Curve to find their active period. Once they have it, they group their customers by latency, the average number of days between a customer's visits. Customers whose latency is less than the active period are called active customers. The brand can also define further tags beyond active, such as dormant, perished, lapsed and lost. All of these tags can be derived from the active period found from the Lapsation Curve.
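As a rough illustration of the tagging rule described above, here is a minimal pandas sketch. The column names (`user_id`, `latency`), the function name `tag_customers`, and the 90-day active period are assumptions made for the example. Only the active/lapsed split is shown, because the cut-offs for the other tags are brand-specific.

```python
import pandas as pd

# Hypothetical per-user average latency in days (not real brand data).
latency_df = pd.DataFrame({
    "user_id": [1, 2, 3],
    "latency": [15.0, 60.0, 200.0],
})

def tag_customers(latency_df, active_period_days):
    """Tag users as 'active' when latency is below the active period, else 'lapsed'."""
    tagged = latency_df.copy()
    tagged["tag"] = "lapsed"
    tagged.loc[tagged["latency"] < active_period_days, "tag"] = "active"
    return tagged

# 90 days is an assumed active period, chosen only for illustration.
tagged = tag_customers(latency_df, active_period_days=90)
print(tagged)
```

Additional tags such as dormant or lost could be added the same way once the brand has chosen cut-offs for them.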

How to plot the Lapsation Curve?

First, build a data frame with each user's visit count and average latency. Then remove all users with fewer than two visits, because the curve is plotted only for repeat users.

Sample Dataset:

Formula for Latency: datediff(max(bill_date), min(bill_date)) / (count(distinct bill_date) - 1)

Formula for Visits: count(distinct bill_date)
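The two formulas can be sketched in pandas as follows. This is one possible translation, not necessarily the code behind the outputs shown here: the column names `user_id` and `bill_date`, the function name `user_latency`, and the sample rows are assumptions.

```python
import pandas as pd

# Hypothetical transactions; one row per bill.
bills = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3],
    "bill_date": pd.to_datetime([
        "2021-01-01", "2021-01-11", "2021-01-31",  # user 1: three visit days
        "2021-02-01", "2021-02-01",                # user 2: two bills, same day
        "2021-03-05",                              # user 3: single visit
    ]),
})

def user_latency(bills):
    """Per-user visits and average latency (days), repeat users only."""
    # Drop any time-of-day component so visits are counted per calendar day.
    days = bills.assign(bill_day=bills["bill_date"].dt.normalize())
    per_user = days.groupby("user_id")["bill_day"].agg(
        first_visit="min", last_visit="max", visits="nunique"
    )
    # Keep users with at least two distinct visit days.
    repeat = per_user[per_user["visits"] >= 2].copy()
    span_days = (repeat["last_visit"] - repeat["first_visit"]).dt.days
    # datediff(max, min) / (count(distinct bill_date) - 1)
    repeat["latency"] = span_days / (repeat["visits"] - 1)
    return repeat[["visits", "latency"]]

latency_df = user_latency(bills)
print(latency_df)  # user 1 only: 3 visits over 30 days, latency 15.0
```

Filtering on distinct visit days before dividing also avoids a division by zero for users whose bills all fall on the same date.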

Output1:

From this data, build a second data frame that holds the number of users at each latency value. Then compute the percentage of customers at each latency value as a cumulative sum, so that each row gives the share of users whose latency is at or below that value.
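A minimal sketch of this step is shown below, under a few assumptions: latency is rounded to whole days before counting, the sample latencies are made up, and the names `lapsation_table` and `active_period` are hypothetical helpers. The second helper reads off the first latency at which the cumulative share reaches 80%.

```python
import pandas as pd

# Hypothetical average latencies (days) for ten repeat users.
latency_df = pd.DataFrame({
    "latency": [10.0, 10.4, 20.0, 30.0, 30.0, 30.0, 45.0, 60.0, 90.0, 120.0]
})

def lapsation_table(latency_df):
    """Latency-wise user counts with cumulative count and cumulative percentage."""
    # Rounding to whole days is an assumption; any consistent binning would work.
    days = latency_df["latency"].round().astype(int)
    table = (
        days.groupby(days).size()
        .rename_axis("latency")
        .reset_index(name="users")
    )
    table["cum_users"] = table["users"].cumsum()
    table["cum_pct"] = 100 * table["cum_users"] / table["users"].sum()
    return table

def active_period(table, threshold_pct=80.0):
    """Smallest latency at which the cumulative share reaches the threshold."""
    reached = table[table["cum_pct"] >= threshold_pct]
    return int(reached["latency"].iloc[0])

table = lapsation_table(latency_df)
print(table)
print(active_period(table))  # 60 for this sample data
```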

Output2:

After getting the cumulative percentage of users for each latency, plot it using matplotlib, seaborn, or the pandas plot method. Below is the code for a seaborn plot with some additional visual elements to make the graph more readable.

Plot Code:
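A minimal sketch of such a plot, assuming a table with `latency` and `cum_pct` columns like the one described above. It uses matplotlib directly rather than seaborn to keep dependencies small; `sns.lineplot(data=table, x="latency", y="cum_pct", ax=ax)` could replace the `ax.plot` call. The sample values, the function name `plot_lapsation_curve`, and the 80% guide lines are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical cumulative percentages per latency value (days).
table = pd.DataFrame({
    "latency": [10, 20, 30, 45, 60, 90, 120],
    "cum_pct": [20.0, 30.0, 60.0, 70.0, 80.0, 90.0, 100.0],
})

def plot_lapsation_curve(table, threshold_pct=80.0):
    """Plot cumulative % of repeat users against latency and mark the threshold."""
    fig, ax = plt.subplots(figsize=(8, 5))
    ax.plot(table["latency"], table["cum_pct"], marker="o")

    # Mark the first latency at which the cumulative share reaches the threshold.
    reached = table[table["cum_pct"] >= threshold_pct]
    if not reached.empty:
        cutoff = reached["latency"].iloc[0]
        ax.axhline(threshold_pct, linestyle="--", color="grey")
        ax.axvline(cutoff, linestyle="--", color="grey")
        ax.annotate(f"{cutoff} days", xy=(cutoff, threshold_pct),
                    xytext=(5, -15), textcoords="offset points")

    ax.set_xlabel("Average latency (days)")
    ax.set_ylabel("Cumulative % of repeat users")
    ax.set_title("Lapsation Curve")
    ax.set_ylim(0, 105)
    return fig, ax

fig, ax = plot_lapsation_curve(table)
```

Calling `fig.savefig("lapsation_curve.png")` afterwards writes the chart to disk; the file name is arbitrary.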

Output3:

Some Retail Examples:

The retail industry has multiple verticals, such as FMCG, apparel & footwear, jewelry, and luxury. Each vertical has a different lapse period, depending on its customers' purchase behavior. The following are some of the standard values used by some Indian and Southeast Asian brands:

  • Apparel & footwear brands have an average lapsation period of 270 to 300 days.
  • Jewelry or luxury brands have an average lapsation period of 365 to 460 days.
  • Supermarket chains and the FMCG sector, on the other hand, have a much shorter lapsation period, in the range of 80 to 150 days. This is because customers tend to shop again with these brands within a quarter.

Concluding Remarks:

Using the above curve, a brand can decide on an engagement strategy for its customers. Brands usually use it to separate their loyal/active customers from their lapsed customers. Once they have these buckets, they can target specific customers based on their other purchase patterns, such as ATV and ABS, to maximize ROI.

Use the above method to find the Lapsation Curve for your brand and maximize your ROI by building a better engagement plan.
