Predicting purchases with Market Basket Analysis

Create your own “Customers who bought this also bought” section using MBA with Association Rules

Published in

Geek Culture

11 min read · May 26, 2021

Shopping Customers | Photo by Gustavo Fring from Pexels

Do you ever make impulse purchases? Sure you do. But, do you ever wonder why these products are so conveniently available to you even when you weren’t looking for them? All of us know about the Customers who bought this also bought section on Amazon, and the aforementioned impulse purchases happen there quite a lot.

Even in physical grocery stores, you’d find complementary items (e.g. bread and butter) on the same shelf, or at least in close proximity to each other. Knowing which items are complementary also helps stores offer discounts on them in whatever way they deem profitable. Advertisements for one item can be targeted at customers of the other. Sometimes the company might even come up with a combined product for the two, which might increase sales.

Now the question arises: how do we find these complementary items? The answer is Market Basket Analysis.

What is Market Basket Analysis?

Market Basket Analysis (MBA) is a modelling technique based upon the theory that if you buy a certain group of items, you are more (or less) likely to buy another group of items.
For example: while at McDonald’s, if you buy sandwiches and cookies, you are more likely to buy a drink than someone who did not buy a sandwich.

In the retail industry, MBA refers to an unsupervised data mining technique that discovers co-occurrence relationships among customers’ purchase activities. The volume of sales made from user clicks on Amazon’s “Customers who bought this product also bought these products…” call to action links is a testament to the effect and importance of market basket analysis.

The objective of Market Basket Analysis, and of this article, is to use previous purchase data to predict what a person is likely to buy after purchasing a given product or, put simply, to find what relates to the previously bought product.

Some Terminologies

Now, we need to get familiar with the terminologies used here to get a clearer understanding of the topic.

Items

Items are the objects that we are identifying associations between. For an online retailer, each item is a product in the shop. For a publisher, each item might be an article, a blog post, a video etc. A group of items is an item set.

Item set, I = {i₁,i₂,i₃, … ,iₙ}

Transactions

Transactions are instances of groups of items co-occurring together. For an online retailer, a transaction is, generally, a single purchase. For a publisher, a transaction might be the group of articles read in a single visit to the website. (It is up to the analyst to define over what period to measure a transaction.) For each transaction, then, we have an item set.

Transaction, tₙ = {iᵢ,iⱼ, … ,iₖ}

Rules

Rules are statements of the form

{i₁,i₂, … } ⇒ {iₖ}

i.e. if you have the items in the item set on the left-hand side (LHS) of the rule, {i₁,i₂, … }, then it is likely that a visitor will be interested in the item on the right-hand side (RHS), {iₖ}.

For example, the sandwiches and cookies from above example become the LHS and the drink becomes the RHS.
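To make these three terms concrete, here is a minimal Python sketch. The basket contents and the helper name `matches` are made up for illustration; they are not taken from either dataset used later in this article.

```python
# Hypothetical toy data: each transaction is the set of items in one basket.
transactions = [
    {"sandwich", "cookies", "drink"},
    {"sandwich", "drink"},
    {"cookies"},
    {"sandwich", "cookies", "drink"},
]

# The item set I is the union of everything that appears in any transaction.
items = set().union(*transactions)

# A rule LHS => RHS, stored as a pair of frozensets.
rule = (frozenset({"sandwich", "cookies"}), frozenset({"drink"}))


def matches(rule, basket):
    """Return True if the basket contains every item on both sides of the rule."""
    lhs, rhs = rule
    return (lhs | rhs) <= basket


print(sorted(items))
print([matches(rule, t) for t in transactions])
```

Storing baskets as sets keeps the "does this basket contain that item set?" check down to a single subset comparison.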

Methodology

Association Rule Mining

Association rule mining is used for finding frequent patterns, associations, correlations, or causal structures among sets of items in transaction databases.
It helps us understand customer buying habits by finding associations and correlations between the different items that customers place in their “shopping basket”.
Rule Form: Antecedent Item ⇒ Consequent Item

Apriori Principle

The apriori principle can reduce the number of itemsets we need to examine.

Put simply, the apriori principle states that: if an itemset is infrequent, then all its supersets must also be infrequent.

This means if {beer} was found to be infrequent, we can expect {beer, pizza} to be equally or even more infrequent. So in consolidating the list of popular item sets, we need not consider {beer, pizza}, nor any other item set configuration that contains beer.
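Here is a rough sketch of how that pruning can work in a level-wise search. It is a simplified illustration in plain Python, not the implementation used to produce the results in this article; the function name `frequent_itemsets` and the toy baskets are assumptions.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise search: size-k candidates are built only from frequent (k-1)-itemsets."""
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    singles = {frozenset([i]) for t in transactions for i in t}
    current = {s for s in singles if support(s) >= min_support}
    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Join step: unions of frequent sets that have exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step (apriori principle): every (k-1)-subset must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Hypothetical baskets: pizza appears only once, so it is infrequent.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"beer", "pizza"},
    {"bread", "butter", "beer"},
]
result = frequent_itemsets(transactions, min_support=0.4)
for itemset, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

Because {pizza} falls below the threshold, no candidate containing pizza is ever generated, so {beer, pizza} is never counted at all.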

Now, we use three very important concepts of Support, Confidence & Lift in order to implement and understand Market Basket Analysis.

Support

The support of an item or item set is the fraction of transactions in the data set that contain that item or item set. Support determines how often a rule is applicable to a given data set.

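Support(A => B) = Support(A ∪ B) = (number of transactions containing every item in both A and B) / (total number of transactions)

Note that Support(A ∪ B) can never exceed min(Support(A), Support(B)), but it is not, in general, equal to it.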
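As a small illustration, support can be computed by counting the baskets that contain every item of the item set. The baskets and the function name `support` below are hypothetical.

```python
# Hypothetical baskets for illustration.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"beer", "pizza"},
    {"bread", "butter", "beer"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


s_butter = support({"butter"}, transactions)
s_milk = support({"milk"}, transactions)
s_both = support({"butter", "milk"}, transactions)

# The combined item set is never more frequent than either of its parts.
print(s_butter, s_milk, s_both)
```

In this toy data butter and milk are each fairly common, yet they share only one basket, so the support of the pair is well below the support of either item alone.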

Confidence

Confidence is defined as the conditional probability that a transaction containing the LHS (the antecedent item A) will also contain the RHS (the consequent item B).

Confidence(A => B) = P(B|A) = P(A ∩ B)/P(A)
Confidence(A => B) = Support(A ∪ B)/Support(A)

A rule’s confidence is a measurement of its predictive power or accuracy. The confidence tells us the proportion of transactions where the presence of item or itemset LHS results in the presence of item or itemset RHS.

One drawback of the confidence measure is that it might misrepresent the importance of an association. Take the rule {apple} ⇒ {beer}: confidence only accounts for how popular apples are, but not beers. If beers are also very popular in general, there will be a higher chance that a transaction containing apples will also contain beers, thus inflating the confidence measure. To account for the base popularity of both constituent items, we use a third measure called lift.
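The sketch below shows this drawback on made-up numbers. The baskets are hypothetical and deliberately built so that beer is in almost every basket; `support` and `confidence` are illustrative helper names.

```python
# Hypothetical baskets in which beer is popular across the board.
transactions = [
    {"apple", "beer"},
    {"apple", "beer"},
    {"apple", "beer", "rice"},
    {"apple"},
    {"beer"},
    {"beer", "rice"},
    {"beer", "milk"},
    {"beer"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(lhs, rhs, transactions):
    """Confidence(lhs => rhs) = Support(lhs ∪ rhs) / Support(lhs)."""
    return support(set(lhs) | set(rhs), transactions) / support(lhs, transactions)


conf_apple_beer = confidence({"apple"}, {"beer"}, transactions)
beer_base_rate = support({"beer"}, transactions)
print(conf_apple_beer)  # 0.75
print(beer_base_rate)   # 0.875
```

A confidence of 0.75 looks strong on its own, but beer is already in 87.5% of all baskets here, so buying apples actually makes beer slightly less likely than average.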

Lift

Lift gives the correlation between A and B in the rule A ⇒ B.
Correlation shows how one item-set A affects the item-set B.
A and B are independent iff: P(A ⋂ B)=P(A) x P(B), otherwise dependent. Lift is given by:

Lift(A => B) = P(A ⋂ B)/[P(A) x P(B)]
Lift(A => B) = Support(A ∪ B)/[Support(A) x Support(B)]
Lift(A => B) = Confidence(A => B)/Support(B)

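So, the higher the lift, the higher the chance of A and B occurring together. A lift of 1 means A and B are independent, a lift greater than 1 means they occur together more often than expected by chance, and a lift less than 1 means they occur together less often than expected.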
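To see the lift formulas agree with each other, here is a minimal sketch on hypothetical baskets (the data and the helper names `support` and `lift` are assumptions for illustration only).

```python
import math

# Hypothetical baskets in which beer is popular across the board.
transactions = [
    {"apple", "beer"},
    {"apple", "beer"},
    {"apple", "beer", "rice"},
    {"apple"},
    {"beer"},
    {"beer", "rice"},
    {"beer", "milk"},
    {"beer"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def lift(lhs, rhs, transactions):
    """Lift(lhs => rhs) = Support(lhs ∪ rhs) / (Support(lhs) x Support(rhs))."""
    both = support(set(lhs) | set(rhs), transactions)
    return both / (support(lhs, transactions) * support(rhs, transactions))


lift_apple_beer = lift({"apple"}, {"beer"}, transactions)

# The same value via Confidence(A => B) / Support(B).
conf = support({"apple", "beer"}, transactions) / support({"apple"}, transactions)
lift_via_confidence = conf / support({"beer"}, transactions)

print(lift_apple_beer, lift_via_confidence)
print(math.isclose(lift_apple_beer, lift_via_confidence))
```

In this toy example the lift comes out below 1, which says that apples and beer appear together slightly less often than their individual popularity would suggest.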

Goals of Association Rule Mining

When we apply the Association Rule Mining on a given set of transactions X, the goal is to find all the rules with:

Support greater than or equal to min_support
Confidence greater than or equal to min_confidence
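The two thresholds above can be sketched as a simple filter. This version enumerates item sets by brute force, which is only workable for tiny data; it is not the apriori algorithm and not the code behind the results in this article. The function name `mine_rules`, the `max_len` parameter and the baskets are all assumptions.

```python
from itertools import combinations


def mine_rules(transactions, min_support, min_confidence, max_len=3):
    """Return (lhs, rhs, support, confidence) for rules with a single-item RHS."""
    n = len(transactions)
    items = sorted(set().union(*transactions))

    def support(itemset):
        return sum(set(itemset) <= t for t in transactions) / n

    rules = []
    for size in range(2, max_len + 1):
        for itemset in combinations(items, size):
            s = support(itemset)
            # Skip item sets that miss the support threshold (or never occur).
            if s < min_support or s == 0:
                continue
            for rhs in itemset:
                lhs = tuple(i for i in itemset if i != rhs)
                conf = s / support(lhs)
                if conf >= min_confidence:
                    rules.append((lhs, rhs, s, conf))
    return rules


# Hypothetical baskets for illustration.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"beer", "pizza"},
    {"bread", "butter", "beer"},
]
rules = mine_rules(transactions, min_support=0.4, min_confidence=0.7)
for lhs, rhs, s, conf in rules:
    print(set(lhs), "=>", rhs, "support", s, "confidence", round(conf, 2))
```

Raising either threshold shrinks the rule list; lowering them quickly produces many more rules, which is why real datasets can yield tens of thousands.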

Steps for Market Basket Analysis using Association Rules

Collecting Data
Exploring & Preparing the Data
Training a Model on the Data
Evaluating Model Performance
Improving Model Performance

Data

Now we are going to apply MBA to two publicly available datasets, obtained from two different stores.

Dataset 1

“Online Retail” contains all the transactions occurring between 01/12/2010 and 09/12/2011 for a UK-based and registered online retailer, taken from the UCI Machine Learning Repository.

Dataset Description

Number of Rows: 541909
Number of Attributes: 8

After preprocessing, the dataset includes 406,829 records and 10 fields: InvoiceNo, StockCode, Description, Quantity, InvoiceDate, UnitPrice, CustomerID, Country, Date, Time.

The matrix contains 19295 transactions (rows) and 566 unique items (columns) bought by customers in the one month period.

1803 out of 19295 transactions contain WHITE HANGING HEART T-LIGHT HOLDER, while 1709 out of 19295 transactions contain REGENCY CAKESTAND 3 TIER.

Dataset 2

Groceries data from Department of Statistics and Biostatistics, California State University

The matrix contains 9835 transactions (rows) and 169 unique items (columns) bought by customers in the one month period.

2513 out of 9835 transactions contain whole milk, while 1809 out of 9835 transactions contain rolls/buns.
There are 2159 transactions that contain only 1 item purchased, and only 1 transaction with 32 unique items bought.
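Figures like these come from a transaction-by-item matrix. As a minimal sketch, assuming baskets are available as Python sets, the matrix and the counts can be built as follows. The four baskets are made up; only the item names echo the groceries data.

```python
# Hypothetical baskets standing in for the real transactions.
transactions = [
    {"whole milk", "rolls/buns"},
    {"whole milk"},
    {"rolls/buns", "soda", "yogurt"},
    {"whole milk", "yogurt"},
]

# One column per unique item, in a fixed (sorted) order.
items = sorted(set().union(*transactions))

# One row per transaction: 1 if the item is in the basket, else 0.
matrix = [[int(item in t) for item in items] for t in transactions]

n_rows, n_cols = len(matrix), len(items)

# Column sums give the number of transactions containing each item.
item_counts = {item: sum(row[j] for row in matrix) for j, item in enumerate(items)}

# Row sums give the number of unique items in each basket.
basket_sizes = [sum(row) for row in matrix]
single_item_baskets = sum(size == 1 for size in basket_sizes)

print(n_rows, n_cols)
print(item_counts)
print(basket_sizes, single_item_baskets)
```

For data as large as the real datasets a sparse representation would be more practical than a dense list of lists; the dense form is used here only to keep the idea visible.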

Results

Finally, let’s have a look at the results and inferences obtained after applying association rules over these datasets. These inferences are depicted below in a visual way with the help of graphs along with some more details to describe these graphs.

Dataset 1

This dataset from the UCI Machine Learning Repository can be broken down in different ways to make a lot of different inferences.

Time of people purchasing items

This figure answers the question of what time of day people most often purchase online.
Order volume varies clearly with the hour of the day.
Most orders happened between 10:00–15:00.
This helps retailers show more advertisements during these peak hours, combined with the similar products suggested by Market Basket Analysis.
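A count like this can be sketched by grouping distinct invoices by the hour of their timestamp. The rows, the invoice IDs and the date format below are all hypothetical; the real InvoiceDate column may be formatted differently.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical (InvoiceNo, InvoiceDate) rows; one invoice can span several rows.
rows = [
    ("A001", "2010-12-01 08:26"),
    ("A001", "2010-12-01 08:26"),
    ("A002", "2010-12-01 10:28"),
    ("A003", "2010-12-01 12:34"),
    ("A004", "2010-12-01 12:50"),
]

# Collect the distinct invoice numbers seen in each hour of the day.
invoices_by_hour = defaultdict(set)
for invoice_no, invoice_date in rows:
    hour = datetime.strptime(invoice_date, "%Y-%m-%d %H:%M").hour
    invoices_by_hour[hour].add(invoice_no)

orders_per_hour = {hour: len(nos) for hour, nos in sorted(invoices_by_hour.items())}
print(orders_per_hour)
```

Counting distinct invoice numbers rather than rows matters, because each product line of an order appears as its own row.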

Number of items each customer buys

The figure shows how many items each customer bought. People mostly purchased fewer than 10 items per invoice.

The top 20 best selling items

The figure above represents the top twenty list of bestsellers.

Absolute Item Frequency Plot for top 20 items

The absolute item frequency plot shows how many times each item was bought, as a raw count.
It plots the numeric frequencies of each item independently.
The RColorBrewer library adds the colour to the plot.

Relative Item Frequency Plot for top 20 items

The relative item frequency plot shows the share of transactions in which each item was bought, as a percentage.
This graph shows the relative item frequency of the top 20 items; the most frequently bought item is WHITE HANGING HEART T-LIGHT HOLDER.
The RColorBrewer library adds the colour to the plot.
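The numbers behind absolute and relative frequency plots are simple to compute. Here is a minimal Python sketch; the function name `item_frequencies` and the baskets are assumptions, and this is not the plotting code used for the figures.

```python
from collections import Counter


def item_frequencies(transactions, top_n, relative=False):
    """Top-N items by number of transactions containing them (or by share of transactions)."""
    counts = Counter(item for t in transactions for item in t)
    top = counts.most_common(top_n)
    if relative:
        n = len(transactions)
        return [(item, count / n) for item, count in top]
    return top


# Hypothetical baskets for illustration.
transactions = [
    {"holder", "cakestand"},
    {"holder"},
    {"holder", "bag"},
    {"cakestand"},
]
top_absolute = item_frequencies(transactions, top_n=2)
top_relative = item_frequencies(transactions, top_n=2, relative=True)
print(top_absolute)  # [('holder', 3), ('cakestand', 2)]
print(top_relative)  # [('holder', 0.75), ('cakestand', 0.5)]
```

The relative frequency of a single item is the same thing as its support.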

Scatter Plot for the given data (49122 rules)

The scatter plot visualises the association rules: the x axis is the support, the y axis is the confidence, and the darkness of each point shows the lift.
This is a plot of the 49122 rules extracted from Dataset 1.
It demonstrates that most of the rules have a support of less than 0.002.
It also shows that lift is highest when support is low.
The confidence levels in Dataset 1 are much higher than in Dataset 2 (shown later). The points in the Dataset 2 scatter plot are all clustered around 0.01, but for Dataset 1 a neat trend is observed, with confidence moving logistically towards 1 as support increases.
As far more rules are extracted from Dataset 1 than from Dataset 2, it depicts the real-world analysis better and hence provides a better scatter plot.
This concludes the observation with a striking result: in these datasets, as the number of extracted rules increases, the confidence level tends to one, giving us a more accurate result.

A Two Key Plot for the given data (49122 rules)

The two-key plot is like the scatter plot: the x axis is the support, the y axis is the confidence, and the colour changes with the lift, as shown in the legend on the right.
This graph shows the two-key plot for all 49122 rules extracted from Dataset 1.
It also shows that lift is highest when support is low.

Parallel Coordinates Plot for the rules

The parallel coordinates plot shows which items on the LHS of a rule lead to which item on the RHS.
This is a parallel coordinates plot for 50 rules from the dataset.
It shows that if someone buys BILLBOARD FONTS DESIGN, they buy WRAP next; the darker colour shows that the confidence is high.

Dataset 2

This dataset from the Department of Statistics and Biostatistics, California State University can be broken down in different ways to make a lot of different inferences.

Relative Item Frequency for the Top 10 Items

The itemFrequencyPlot() function allows us to show either absolute or relative values.
The figure above shows the relative item frequency for the top 10 items in Dataset 2.
It plots how many times these items have appeared as compared to others.
Whole milk is the best selling product, followed by rolls/buns and other vegetables.

Scatter Plot for the given data (463 Rules)

The scatter plot visualises the association rules: the x axis is the support, the y axis is the confidence, and the darkness of each point shows the lift.
This is a plot of the 463 rules extracted from Dataset 2.
It demonstrates that most of the rules have a support of less than 0.03.
It also shows that lift is highest when support is low.

Graph for top 50 Rules for Association Rules

The graph plot lets us visualise the association rules easily.
The size of each bubble increases with the support, while the colour darkens as the lift increases.
The arrows indicate which item is bought after which.
In this plot, sausage is bought after sliced cheese.
The range of support and lift is also given in the top right corner.

Parallel coordinates plot for 100 Rules

The parallel coordinates plot shows which items on the LHS of a rule lead to which item on the RHS.
This is a parallel coordinates plot for 100 rules from the dataset.
It shows that if someone buys berries, they are more likely to buy whipped/sour cream next; the darker colour shows that the confidence is high.

Grouped Matrix for 463 Rules

In this grouped matrix plot, the rules are represented as a grouped matrix-based visualisation.
It is a novel way of creating nested groups of rules (more specifically, antecedent itemsets) via clustering.
The nested groups form a hierarchy that can be explored interactively, down to each individual rule.
The support and lift measures are represented by the size and color of the balloons, respectively.
In this case it’s not a very useful visualization, since we only have whipped/sour cream on the right-hand-side of the rules.

Final Words

Market basket analysis is an unsupervised machine learning technique that can be useful for finding patterns in transactional data. It can be a very powerful tool for analyzing the purchasing patterns of consumers.

The main algorithm used for market basket analysis is the apriori algorithm. The three statistical measures in market basket analysis are support, confidence, and lift.

Market basket analysis with the help of association rules can readily reveal customer buying behaviour, and with these concepts a retailer can set up the shop accordingly to expand the business in future.

Although Market Basket Analysis conjures up pictures of shopping carts and supermarket shoppers, it is important to note that it can also be applied to:

Analysis of credit card purchases
Analysis of telephone calling patterns
Identification of fraudulent medical insurance claims (consider cases where common rules are broken)
Analysis of telecom service purchases

In this article, we examined the transactional patterns of grocery purchases and discovered both obvious and not-so-obvious patterns in certain transactions.

Finally, if you faced any difficulties or have any doubts, feel free to contact me.

Written by Shashank Singh