Published in Fields Data

If you are buying bread, will you buy jam?

Have you ever wondered why you sometimes step out to a shop with a list of items in mind, but end up buying more than you had initially planned? This certainly has something to do with consumers’ tendency to impulse buy, but could there be more to it? With machine learning now applied in almost every domain, the short answer is yes.

Market Basket Analysis

Retailers today take advantage of people’s impulsive buying behavior by adopting machine learning. By using the Apriori Algorithm, for example, retailers are able to identify trends in customers’ buying patterns, and can use these to infer what customers are likely to want, thereby increasing their prospective sales. This technique for revealing associations between items is called Market Basket Analysis.

Example : A shopkeeper notices that sales of milk and bread have increased by 60% in the past few months. To increase profits, s/he decides to add a discount on jam so that customers are tempted to buy this third item too, or s/he might simply place these items close to each other.

However, if the shopkeeper has a dataset of 100,000 transactions, it becomes more difficult to manually draw insights from these sales. This is precisely where Association Rule Mining is used.

Association Rule Mining (ARM)

ARM is used to find interesting associations or relationships among the items in a dataset. It is based on an “IF… THEN” relationship between the items: if a customer has bought item A, the rule estimates the chance that item B will also be bought by the customer in the same transaction. Each such relationship is expressed as a rule of the form A → B. The Apriori Algorithm uses frequent itemsets to generate association rules, and it can be applied to large datasets, such as the aforementioned shopkeeper’s 100,000 transactions.

Rule Evaluation Metrics

Before we move on to the Apriori Algorithm itself, let’s have a look at the three important measures that form its basis, namely support, confidence and lift.

Support : Support is how frequently an item, or a combination of items, is bought by customers. It is measured as the fraction of all transactions that contain that item or combination.

Example : The fraction of all transactions that include bread.

Confidence : Confidence represents how often item Y is bought when item X is bought. It is the fraction of transactions containing X that also contain Y, that is, support(X and Y) / support(X).

Example : Of the transactions that include bread, the fraction that also include milk.

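Lift : Lift represents the strength of the rule. It is the confidence of the rule X → Y divided by the support of Y. A lift greater than 1 means X and Y are bought together more often than would be expected if they were independent, a lift close to 1 means there is little or no association, and a lift below 1 means that buying X makes buying Y less likely.

Example : How much more likely customers are to buy milk when bread is in the same transaction, compared with how often milk is bought overall.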
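To make the three measures concrete, here is a minimal Python sketch. The transactions are invented purely for illustration, and the function names (support, confidence, lift) are our own rather than part of any library.

```python
# Toy transactions, invented for illustration only (not real sales data).
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "jam"},
    {"bread", "jam"},
    {"milk"},
    {"bread", "milk", "eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing `antecedent`, the share that also contain `consequent`."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence of the rule divided by the overall support of `consequent`."""
    return confidence(antecedent, consequent, transactions) / support(consequent, transactions)


print("support(bread)          =", support({"bread"}, transactions))
print("confidence(bread->milk) =", confidence({"bread"}, {"milk"}, transactions))
print("lift(bread->milk)       =", lift({"bread"}, {"milk"}, transactions))
```

In this toy data, bread appears in 4 of 5 transactions, and 3 of those 4 also contain milk. Milk itself appears in 4 of 5 transactions, so the lift comes out slightly below 1: here, knowing that a basket contains bread does not make milk any more likely than usual.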

Support, confidence and lift are the key measures used by the Apriori Algorithm to uncover associations between items, particularly in the retail sector. Could we apply a similar concept to the humanitarian context?

The Apriori Algorithm can indeed be used in the humanitarian sector too. If each organization is treated like a transaction and the sectors it contributes to like items, the algorithm can help identify which sectors humanitarian organizations are more likely to contribute to.

Apriori Algorithm

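Apriori is a classic data mining algorithm used for Market Basket Analysis. It finds the frequent itemsets and generates the above-mentioned association rules. Frequent itemsets are those whose support is at least a chosen minimum threshold; the rules generated from them are then kept only if their confidence also meets a minimum threshold. The algorithm relies on the Apriori property: any subset of a frequent itemset must also be frequent. Conversely, any itemset that contains an infrequent subset cannot be frequent, so it can be discarded without being counted.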

As an example, if the combination {Health, WASH, Food security} is frequent among contributing organizations, then the combination {Health, WASH} must also be frequent.
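The level-wise search described above can be sketched in a few lines of Python. This is a simplified illustration rather than a production implementation: the organizations list is hypothetical data made up for this example, and the apriori function and the min_support value of 0.5 are our own assumptions.

```python
from itertools import combinations

# Hypothetical data: each set lists the sectors one organization contributes to.
organizations = [
    {"Health", "WASH", "Food security"},
    {"Health", "WASH", "Food security"},
    {"Health", "WASH", "Food security"},
    {"Health", "WASH"},
    {"Education", "Health"},
    {"Education"},
]


def apriori(transactions, min_support):
    """Return {frozenset: support} for every itemset with support >= min_support."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        # Join step: combine frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        next_level = {}
        for cand in candidates:
            # Prune step (Apriori property): every (k-1)-subset must be frequent.
            if any(frozenset(sub) not in current for sub in combinations(cand, k - 1)):
                continue
            s = support(cand)
            if s >= min_support:
                next_level[cand] = s
        frequent.update(next_level)
        current = next_level
        k += 1
    return frequent


frequent = apriori(organizations, min_support=0.5)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(s, 2))
```

With this made-up data, {Health, WASH, Food security} meets the threshold, and so does each of its subsets, while Education is dropped at the first level and never appears in any larger candidate.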

Conclusion :

Market Basket Analysis uncovers the underlying patterns in customers’ buying behavior by finding associations between items. It is one of the ideas behind the ‘You may also like…’ or ‘Similar Products’ recommendations that you see on platforms like Amazon, Netflix, Flipkart or Spotify, to name a few. This technique is also used in the fields of education, forestry and medicine, and certainly has a larger scope in various other domains. So next time you recommend something to someone, make sure you have done your homework!



We envision a world in which all the organizations in the humanitarian and development sectors work together to reduce duplication, optimize resources and maximize their impacts. Staying true to this mission, we share our data science journey in this publication to do just that.
