Which products have your customers bought together that you don’t know about? — Association rules with Apriori

Association rule analysis of a bakery with Python and Power BI

Gabriel Reversi
4 min read · Jan 14, 2023

Imagine you are a married man with a baby. As the good father and husband that you are, you try to help your wife around the house beyond caring for the baby: cooking, cleaning, taking out the garbage, doing the laundry, and going to the supermarket on the weekend.

Friday finally arrives and you need to go to the supermarket to buy diapers. At the supermarket you pass the beer aisle and think, “I don’t work tomorrow and I can wake up a little later, so I will buy a few beers to drink today while I watch the game.” So every Friday you go to the supermarket to buy diapers and pick up a few beers as well.

This behavior pattern of a group of people, commonly attributed to Walmart, became known as “the case of diapers and beers”: men who went to the supermarket to buy diapers for their children ended up buying beer too.

You may have a specific purchase pattern of your own. I’ll use myself as an example: in my home, the olive oil happens to run out in the same week as the coffee powder, so I go to the supermarket and most of the time I buy both.

What is association rule learning?

After the examples above, it is easy to conclude that these associations are just coincidences, but they are not. Staying with the supermarket example: when someone goes to buy bread, there are a lot of other breakfast products nearby, like cookies, cakes, cereals, and toast. These product placements are the result of associations found in previous customer transactions, with one motivation: customers should not have to spend time searching for related items.

Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness. In any given transaction with a variety of items, association rules are meant to discover the rules that determine how or why certain items are connected.

Using a bakery’s data as a case study

We will use a dataset from Kaggle (The Bread Basket, listed in the references) to see how this applies to a business.

The transactions table lists each item bought in each transaction. In transaction 2, the customer bought two units of a product called “Scandinavian”; in transaction 3, the customer bought a hot chocolate and jam; and so on.
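To make the shape of the data concrete, here is a minimal sketch of turning rows like these into one basket per transaction. The rows and the `build_baskets` helper are assumptions for illustration; only transactions 2 and 3 are taken from the description here, and the real Kaggle table is not reproduced.

```python
from collections import defaultdict

# Illustrative (transaction id, item) rows; not the full Kaggle table.
rows = [
    (2, "Scandinavian"),
    (2, "Scandinavian"),
    (3, "Hot chocolate"),
    (3, "Jam"),
]

def build_baskets(rows):
    """Group (transaction id, item) rows into one set of items per transaction."""
    baskets = defaultdict(set)
    for transaction_id, item in rows:
        # A set keeps each item once, so the repeated "Scandinavian" collapses.
        baskets[transaction_id].add(item)
    return dict(baskets)

baskets = build_baskets(rows)
print(baskets)
```

Using sets drops quantities on purpose: association rules usually only ask whether an item was in the basket, not how many units were bought.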

Depending on the kind of business, there can be a lot of products, and not all of them are relevant because some have very few sales. So we need to define a parameter called “support”: the fraction of transactions in which an item (or a set of items) appears.
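As a rough illustration of support, the sketch below counts the share of baskets that contain a given set of items. The baskets are invented toy data and the `support` function is a hypothetical helper, not the code from the article's repository.

```python
# Toy baskets invented for illustration (not the Kaggle data).
baskets = [
    {"Coffee", "Toast"},
    {"Coffee", "Bread"},
    {"Coffee"},
    {"Bread"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for basket in baskets if itemset <= basket)
    return hits / len(baskets)

print(support({"Coffee"}, baskets))           # 0.75 (3 of 4 baskets)
print(support({"Coffee", "Toast"}, baskets))  # 0.25 (1 of 4 baskets)
```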

Here we keep only the items that have a support of 1% or more.
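The filtering step can be sketched in plain Python as a small level-wise Apriori search. This is a simplified illustration under stated assumptions, not the implementation used for the bakery analysis: the baskets are invented, `frequent_itemsets` is a hypothetical helper, and the threshold is 0.4 rather than 1% only because the toy data has five baskets.

```python
from itertools import combinations

# Toy baskets invented for illustration (not the Kaggle data).
baskets = [
    {"Coffee", "Toast"},
    {"Coffee", "Bread"},
    {"Coffee", "Bread"},
    {"Bread"},
    {"Coffee", "Jam"},
]

def frequent_itemsets(baskets, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    n = len(baskets)

    def support(itemset):
        return sum(1 for basket in baskets if itemset <= basket) / n

    def keep_frequent(candidates):
        scored = {c: support(c) for c in candidates}
        return {c: s for c, s in scored.items() if s >= min_support}

    # Level 1: every single item is a candidate.
    current = keep_frequent({frozenset([i]) for b in baskets for i in b})
    result = dict(current)
    k = 2
    while current:
        # Join frequent (k-1)-itemsets into candidates of size k.
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == k}
        # Apriori pruning: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = keep_frequent(candidates)
        result.update(current)
        k += 1
    return result

for itemset, value in frequent_itemsets(baskets, min_support=0.4).items():
    print(sorted(itemset), value)
```

The pruning step is what gives Apriori its name: an itemset can only be frequent if all of its smaller subsets are frequent, so rare items such as "Toast" and "Jam" in this toy data are discarded early and never combined.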

The next step is applying the Apriori model with the metric “lift”. This metric, which we pass to the algorithm, tells us how much more often two items are bought together than we would expect if they were bought independently. For example, a lift of 1.47 for toast and coffee means that a customer who buys toast is 1.47 times as likely to buy coffee as the average customer. Below, we have the result.
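For reference, lift is usually defined as support(A and B) / (support(A) × support(B)); a value above 1 means the two items appear together more often than independence would predict. The sketch below computes it on invented toy baskets with hypothetical helper names, so the number it prints is not the 1.47 obtained from the bakery data.

```python
# Toy baskets invented for illustration (not the Kaggle data).
baskets = [
    {"Toast", "Coffee"},
    {"Toast", "Coffee"},
    {"Coffee"},
    {"Bread"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for basket in baskets if itemset <= basket) / len(baskets)

def lift(antecedent, consequent, baskets):
    """support(A and B) / (support(A) * support(B)).

    Assumes both sides occur at least once; otherwise this divides by zero.
    """
    both = support(set(antecedent) | set(consequent), baskets)
    return both / (support(antecedent, baskets) * support(consequent, baskets))

toast_coffee_lift = lift({"Toast"}, {"Coffee"}, baskets)
print(round(toast_coffee_lift, 2))
```

Lift is symmetric: swapping antecedent and consequent gives the same value, which is why it says that two products go together but not which one drives the other.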

This is our final result: a table that shows which products tend to be bought together. Notice that the table includes fields such as confidence, which is the likelihood that item B is also bought given that item A is bought.
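Confidence can be sketched the same way, as support(A and B) / support(A). The baskets and helper names below are assumptions for illustration only and do not reproduce the values in the result table.

```python
# Toy baskets invented for illustration (not the Kaggle data).
baskets = [
    {"Toast", "Coffee"},
    {"Toast", "Coffee"},
    {"Coffee"},
    {"Bread"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for basket in baskets if itemset <= basket) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Estimated probability of the consequent given the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)

toast_to_coffee = confidence({"Toast"}, {"Coffee"}, baskets)
coffee_to_toast = confidence({"Coffee"}, {"Toast"}, baskets)
print(toast_to_coffee)            # 1.0: every toast basket also has coffee
print(round(coffee_to_toast, 2))  # 0.67: only 2 of 3 coffee basket(s) have toast
```

Unlike lift, confidence depends on the direction of the rule, so "Toast then Coffee" and "Coffee then Toast" are worth reading as two separate rows.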

After that, I built a dashboard prototype in Adobe XD before building it in Power BI.

What did you think? Tell me in the comments.

This was my first case study using Apriori and association rules, and I confess that I really liked it. I find this kind of analysis of customer behavior, purchases, and products very interesting.

If you liked it and find this kind of analysis interesting, let me know in the comments. I’ll go deeper into this subject and share more articles like it.

Also, if you have already worked on this kind of project and have some tips for me, please let me know too. I would like to learn more from someone who has already done a real project.

Thank you for reading :)

To access the code on the GitHub repository, click here.

References

Association rule learning — Wikipedia
A Conceptual Introduction into Association Rule Mining — Part 1 | by Annette Catherine Paul | delvify | Medium
The Bread Basket | Kaggle

Gabriel Reversi

Hi, I'm a data analyst and data scientist. Here I share content about data, tools, methods and business. https://www.linkedin.com/in/gabrielreversi/