Deliver Your Customer Behavior Analysis Report with Streamlit

cindyangelira
Published in Octopus ID Data
8 min read · Jan 24, 2023
Source : Deployed App via Giphy

Customer Behavior Analysis

Customer-centricity has been driving business strategy for quite some time, so analyzing customer behavior has become vital to the success of any e-business or e-commerce company. Put simply, this analysis is the data-driven observation of online customers and how they interact with the company.

With customer behavior data, a company can explore and plan strategies to improve both its top and bottom line. For instance, with the support of customer personalization, a company can better understand how likely potential customers are to respond to new products or services, keep particular customers from churning, design cross-selling product recommendations, find its best customers and look-alikes with similar behavior for purposes such as A/B testing, and much more.

The customer behavior analysis techniques covered in this article are listed below.

1. RFM Segmentation

According to the Pareto principle, roughly 20% of customers contribute 80% of a company’s revenue. Segmentation gives a good understanding of customers’ needs and helps identify the high-potential customers who contribute most to the company.

Recency, frequency, and monetary (RFM) analysis is a powerful and well-recognized segmentation technique in marketing analytics. It is widely used to rank customers based on their purchasing history. The method groups customers along three dimensions: recency (R), frequency (F), and monetary (M).

a. Recency — When was the last time the customer made a purchase?

Recency is the number of days between a customer’s last order and the snapshot date (the analysis date).

b. Frequency — How many times did the customer purchase?

Frequency is the number of purchases a customer makes in a specific period. The higher the frequency, the more loyal the customer is to the company.

c. Monetary — How much money did the customer spend?

Monetary is the amount of money the customer spends during a certain period. The more money spent, the more revenue the customer brings to the company.

With customer segmentation, a company gets a sense of who its customers are and which segments they fall into. With these findings, it can then customize marketing plans, identify trends, plan product development, personalize the user experience, and so on.
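To make the three definitions concrete, here is a minimal sketch in plain Python that derives R, F, and M per customer from a list of transactions. The sample rows, the snapshot date, and the rfm_table helper are all made up for illustration; they are not taken from the article’s repository.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, order_date, amount_spent)
transactions = [
    ("C1", date(2011, 11, 1), 50.0),
    ("C1", date(2011, 12, 1), 30.0),
    ("C2", date(2011, 6, 15), 200.0),
    ("C3", date(2011, 12, 8), 20.0),
    ("C3", date(2011, 12, 9), 25.0),
    ("C3", date(2011, 10, 2), 15.0),
]


def rfm_table(rows, snapshot_date):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    last_order = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, order_date, amount in rows:
        # Keep the most recent order date seen for each customer
        if customer not in last_order or order_date > last_order[customer]:
            last_order[customer] = order_date
        # Assumption: each row is one purchase; with invoice-level data
        # you would count distinct invoices instead
        frequency[customer] += 1
        monetary[customer] += amount
    return {
        c: ((snapshot_date - last_order[c]).days, frequency[c], monetary[c])
        for c in last_order
    }


snapshot = date(2011, 12, 10)  # analysis date, chosen arbitrarily here
rfm = rfm_table(transactions, snapshot)
for customer, (recency, freq, money) in sorted(rfm.items()):
    print(customer, recency, freq, money)
```

The same aggregation maps directly onto a pandas groupby if the transactions already live in a DataFrame.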

2. Market Basket Analysis

Market basket analysis is a method for gaining insight into the granular behavior of customers and understanding their purchasing patterns. It reveals items that are likely to be bought together by looking for combinations of items that frequently occur together in transaction data. It relies on association rules, a concept widely used to analyze retail basket or transaction data.

In short,

“Frequently Bought Together” → Association

“Customers who bought this item also bought” → Recommendation

The method itself might seem simple, but with its findings a company can enhance its product recommendations, cross-selling plans, and more.
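As a rough illustration of how association rules are scored, the sketch below computes the three usual metrics for item pairs: support (the share of baskets containing the items), confidence (support of the pair divided by support of the antecedent), and lift (confidence divided by support of the consequent). The baskets and the pair_rules helper are hypothetical, and only pairs are considered; a real app would more likely use a dedicated library that also handles larger itemsets.

```python
from itertools import combinations

# Hypothetical baskets, one set of items per transaction
baskets = [
    {"mug", "tea"},
    {"mug", "tea", "candle"},
    {"mug", "candle"},
    {"tea"},
    {"mug", "tea"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def pair_rules(baskets, min_support=0.2):
    """Return (antecedent, consequent, support, confidence, lift) tuples."""
    items = sorted(set().union(*baskets))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, baskets)
        if pair_support < min_support:
            continue  # the pair is too rare to be worth a rule
        # Each frequent pair yields two directed rules: a -> b and b -> a
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = pair_support / support({antecedent}, baskets)
            lift = confidence / support({consequent}, baskets)
            rules.append((antecedent, consequent, pair_support, confidence, lift))
    return rules


rules = pair_rules(baskets, min_support=0.3)
for antecedent, consequent, s, c, l in rules:
    print(f"{antecedent} -> {consequent}: support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

A lift above 1 suggests the two items appear together more often than they would if they were bought independently.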

3. Customer Lifetime Value

Although acquiring new customers plays an important role in a company’s growth, optimizing the lifetime value of existing ones costs less. Thus, increasing the value of existing customers is a great way to scale up growth. To increase that value, a company invests in its customers, for example by bombarding them with promotions. These actions make some customers highly valuable in terms of lifetime value, but there are always some customers who pull profitability down. All in all, customer lifetime value is the total gain a company earns from a customer over the entire relationship.

Methods for predicting CLV range from formula-based calculations to probabilistic approaches. The most widely used probabilistic approach is the family of ‘Buy Till You Die’ (BTYD) models. One of them in particular, the BG/NBD model combined with the Gamma-Gamma model, has seen widespread adoption. The BG/NBD model predicts the future number of transactions per customer. Its result is then fed into the Gamma-Gamma model to predict the customers’ monetary value.

The BG/NBD model estimates the parameters of two probability distributions:

  1. The Gamma distribution, from which the customers’ individual transaction rates λ are drawn.
  2. The Beta distribution, from which the customers’ individual churn probabilities p are drawn.
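One way to build intuition for these two distributions is to simulate the process they describe. The sketch below draws a transaction rate λ from a Gamma distribution and a churn probability p from a Beta distribution for each customer, then generates repeat purchases until the customer churns or the observation window ends. It is a simulation of the model’s assumptions, not a fitting procedure, and the parameter values (r, alpha, a, b, horizon) are arbitrary numbers picked for illustration.

```python
import random


def simulate_repeat_purchases(r, alpha, a, b, horizon, rng):
    """Simulate one customer's repeat-purchase count under BG/NBD assumptions."""
    # Transaction rate: Gamma with shape r and rate alpha (scale = 1 / alpha)
    lam = rng.gammavariate(r, 1.0 / alpha)
    # Probability of churning right after any repeat purchase: Beta(a, b)
    p = rng.betavariate(a, b)

    t, purchases = 0.0, 0
    while True:
        # While the customer is active, waiting times are exponential
        t += rng.expovariate(lam)
        if t > horizon:
            break  # observation window ended
        purchases += 1
        if rng.random() < p:
            break  # customer churned after this purchase
    return purchases


rng = random.Random(42)  # fixed seed so the run is reproducible
counts = [
    simulate_repeat_purchases(r=1.0, alpha=4.0, a=1.0, b=3.0, horizon=52.0, rng=rng)
    for _ in range(1000)
]
print("average repeat purchases:", sum(counts) / len(counts))
print("share with no repeat purchase:", counts.count(0) / len(counts))
```

Fitting the model works in the opposite direction: given observed purchase histories, it estimates the Gamma and Beta parameters that make such histories most likely.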

Streamlit

One pain point data scientists face after finding insights or developing a new model is deciding how best to share dynamic results. They might create a separate document and a PowerPoint deck, or export the Jupyter notebook file. But although Jupyter Notebook is a superb tool for playing around with data, it has issues such as a non-linear execution model and difficulty of testing, since it is not a production tool. On top of that, it doesn’t handle user input well and is extremely hard for non-technical users to grasp.

This is where Streamlit comes in handy. Streamlit is an open-source Python framework released in October 2019. It acts as a tailwind and as a medium between data and user interaction within an application. A prototype application is easy to build with no prior knowledge of web frameworks such as Django or Flask.

In addition, according to this book¹, the main reasons Streamlit is widely used are that it is open source, interactive, and easy to learn; runs on any platform; lets you develop apps quickly with little code; provides auto-reloading; can serve pre-trained models; doesn’t require deep web development knowledge to build a data-driven app; and supports various Python libraries for EDA, computer vision, ML, and data science.

In this article, we will walk you through building a reproducible customer behavior analysis report with the Streamlit library (CLTV will be explained in the next article).

Hands-on Customer Behavior Analysis with Streamlit

Dataset

In this article, we are going to use a public e-commerce dataset originally published in the UCI Machine Learning Repository. This is a transnational dataset that contains all the transactions occurring between 01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retailer. The company mainly sells unique all-occasion gifts, and many of its customers are wholesalers. The data can also be fetched from Kaggle.

Getting Started

Create and activate a virtual environment with these typical steps.

  • pip install virtualenv : install virtualenv.
  • virtualenv venv : create a new virtual environment named venv.
  • source venv/bin/activate : activate the venv virtual environment.
  • pip install -r requirements.txt : install the required libraries.

Set Streamlit Configuration

Please refer to the GitHub link provided at the end of the article for the complete script.

Our customer behavior analysis report is structured as follows.

  • Home page : a brief description of the dashboard report.
  • Data Exploration page : EDA with pandas-profiling.
  • RFM Segmentation page : customer segmentation via RFM.
  • Market Basket Analysis page : items frequently bought together.
  • Customer Lifetime Value Prediction page : CLTV prediction with a BTYD model.

Perform Exploratory Data Analysis with Pandas Profiling

The first component of our Streamlit app is the EDA page. As we know, before data scientists apply sophisticated modeling techniques, they typically start by understanding the basics of the dataset, such as data info, missing values, distributions, and so on. Therefore, we’ll automate the EDA process with pandas-profiling, a very powerful Python library.

Please refer to the GitHub link provided at the end of the article for the complete script.

Our Data Exploration page will end up looking like this:

Develop The RFM Segmentation

Here is what the RFM Segmentation page does:

  • Select the appropriate variables with the sliders and filter the data by the date range and region the user chose (the region filter is optional; you can also select None).
  • Filter the Recency, Frequency, and Monetary ranges.
  • Select the number of segments to generate. In this step, we use two RFM segmentation approaches: (1) apply k-means clustering to the scaled R, F, and M variables; (2) give each R, F, and M variable a score from 1 to 4, then generate segments by running k-means on these score variables.
  • Display segment exploration and feature importance.

Please refer to the GitHub link provided at the end of the article for the complete script.
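The 1 to 4 scoring in the second approach can be sketched with quartiles: each customer gets a score according to which quarter of the distribution their value falls into. The snippet below is one possible reading of that step, not the article’s actual script; the RFM values and the quartile_scores helper are hypothetical, and the k-means step that follows the scoring is left out.

```python
from bisect import bisect_right
from statistics import quantiles

# Hypothetical RFM values: customer_id -> (recency_days, frequency, monetary)
rfm = {
    "C1": (1, 34, 150.0),
    "C2": (9, 21, 900.0),
    "C3": (30, 13, 400.0),
    "C4": (60, 8, 20.0),
    "C5": (120, 5, 650.0),
    "C6": (178, 3, 90.0),
    "C7": (200, 2, 300.0),
    "C8": (365, 1, 40.0),
}


def quartile_scores(values, higher_is_better=True):
    """Score each value from 1 to 4 by the quartile it falls into."""
    cuts = quantiles(values, n=4)  # three cut points: Q1, median, Q3
    scores = []
    for value in values:
        bucket = bisect_right(cuts, value)  # 0 (lowest quarter) .. 3 (highest)
        scores.append(bucket + 1 if higher_is_better else 4 - bucket)
    return scores


customers = list(rfm)
# A recent purchase is good, so a LOW recency earns a HIGH score
r_scores = quartile_scores([rfm[c][0] for c in customers], higher_is_better=False)
f_scores = quartile_scores([rfm[c][1] for c in customers])
m_scores = quartile_scores([rfm[c][2] for c in customers])

rfm_scores = dict(zip(customers, zip(r_scores, f_scores, m_scores)))
for customer, scores in rfm_scores.items():
    print(customer, scores)
```

Scoring on quartiles puts the three variables on the same 1 to 4 scale, which is why a clustering step can then treat them equally without extra scaling.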

Our RFM Segmentation page will end up looking like this:

Develop Market Basket Analysis

Here is what the Market Basket Analysis page does:

  • Select the appropriate variables with the sliders and filter the data by region (the region filter is optional; you can also select None).
  • Generate the item insight visualization.
  • Filter the market basket dataset by the minimum support the user enters and display the association rules of the basket data (frequently bought items).
  • Recommend frequently bought items based on the item the user selected.

Please refer to the GitHub link provided at the end of the article for the complete script.
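The last two steps (filtering by minimum support, then recommending items for a selected product) could look roughly like the sketch below. The rules list, its numbers, and the recommend helper are all hypothetical, and ranking by lift is just one reasonable choice; the article’s script may rank differently.

```python
# Hypothetical rules: (antecedent, consequent, support, confidence, lift)
rules = [
    ("mug", "tea", 0.60, 0.75, 0.94),
    ("mug", "candle", 0.40, 0.50, 1.25),
    ("tea", "mug", 0.60, 0.75, 0.94),
    ("candle", "mug", 0.40, 1.00, 1.25),
    ("mug", "coaster", 0.10, 0.125, 1.25),
]


def recommend(rules, selected_item, min_support=0.2, top_n=3):
    """Items most associated with `selected_item`, strongest lift first."""
    matches = [
        rule for rule in rules
        if rule[0] == selected_item and rule[2] >= min_support
    ]
    # Rank by lift: how much more often the items co-occur than by chance
    matches.sort(key=lambda rule: rule[4], reverse=True)
    return [rule[1] for rule in matches[:top_n]]


print(recommend(rules, "mug"))
print(recommend(rules, "mug", min_support=0.05))
```

In the app, selected_item and min_support would come from the user’s widget inputs rather than being hard-coded.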

Our Market Basket Analysis page will end up looking like this:

Create The Multipage App

Please refer to the GitHub link provided at the end of the article for the complete script.
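For readers who want a feel for the structure before opening the repository, here is one common way to wire several pages together: a dictionary that maps page names to render functions, plus a sidebar selectbox to choose between them. This is a simplified sketch and not necessarily how the article’s app is organized. The page bodies are placeholders, and passing st into each function is a choice made here so the routing can be exercised without Streamlit installed.

```python
def home_page(st):
    st.title("Customer Behavior Analysis")
    st.write("A brief description of the dashboard report.")


def placeholder_page(name):
    """Build a stand-in render function for a page that is not written yet."""
    def render(st):
        st.title(name)
        st.write("Page content goes here.")
    return render


# Page registry: sidebar label -> function that renders the page
PAGES = {
    "Home": home_page,
    "Data Exploration": placeholder_page("Data Exploration"),
    "RFM Segmentation": placeholder_page("RFM Segmentation"),
    "Market Basket Analysis": placeholder_page("Market Basket Analysis"),
    "Customer Lifetime Value Prediction": placeholder_page(
        "Customer Lifetime Value Prediction"
    ),
}


def run_app(st=None):
    """Render whichever page is picked in the sidebar."""
    if st is None:
        # Use the real Streamlit module when no stand-in is supplied
        import streamlit as st
    choice = st.sidebar.selectbox("Go to", list(PAGES))
    PAGES[choice](st)
```

To use it, call run_app() on the last line of app.py and start the app with streamlit run app.py.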

Deploy the Streamlit App

Finally, we run our app. You can launch it in your local browser by running streamlit run app.py, but we’ll also make it accessible to other users. Various easy-to-use public cloud platforms, several of them free, are available for deploying the app, for instance Streamlit Cloud, Hugging Face Spaces, and Heroku. Each has its own pros and cons. For now, we choose to deploy our app to Streamlit Cloud, as it was the fastest to load the app when accessed.

Source : Author

Finally, here is our customer behavior analysis report.

Source : App Deployed

However, I have noticed that the app sometimes crashes on Streamlit Cloud. I recommend cloning this GitHub repository and running it on your local machine.

Thank you for reading!

References

And a shout-out to the cool articles and the book that made this one happen.

[1] Beginner’s Guide to Streamlit with Python : Build Web-Based Data and Machine Learning Applications.

[2] Damian Boh. 3 Easy Ways to Deploy your Streamlit Web App Online. TDS Article.

[3] Countanst. Why Consumer Behavior Analysis Is So Relevant to the eCommerce business? Medium Article.

[4] Albers Uzila. 8 Simple Steps to Build Your First Streamlit App. Medium Article.

[5] Stef Smeets. Forget about Jupyter Notebooks — showcase your research using Dashboards. Medium Article.

[6] Alexander Mueller. 5 reasons why jupyter notebooks suck. TDS Article.

[7] Barış Karaman. Customer Segmentation. TDS Article.

[8] Hua Shi. Consumer Behavior Analysis — Click the Ad or Not?. Medium Article.

[9] Tirth Shah. Market Basket Analysis. Medium Article.

[10] Michał Oleszak. Buy Till You Die: Understanding Customer Lifetime Value. TDS Article.

[11] Adam Brownell. Customer Behavior : Buy Till You Die Model. TDS Article.
