3.1) Introduction to Association Rules

Hello! This article is part of a full tutorial on Recommendation Systems, which you can find here. I strongly recommend following the agenda of this tutorial if you want to master the topic.

Basic Concepts of Association Rules

In the previous article we already saw some examples of association rules used in e-commerce. Association rule mining is a popular and well-researched data mining method for discovering interesting relations between items in databases.

The goal of association rule mining is to find rules that predict the occurrence of an item (Item Y) based on the occurrence of other items (Item X) in a transaction.

For example: predict the chance of a user buying a phone cover (Item Y) if they have already bought the phone (Item X), and if the chance is high enough, recommend the phone cover to anyone who is buying the phone. There is a good chance of discovering strong rules in big data, but keep in mind that the implication means co-occurrence, and co-occurrence does not necessarily mean causality! We cannot assert that buying one item causes buying the other when the items are merely frequently bought together.

An association rule suggests adding a phone cover to your cart if you are buying the phone

Besides market basket data, association analysis is also applicable to other application domains such as bioinformatics, medical diagnosis, web mining and scientific data analysis. In the analysis of Earth science data, association patterns may reveal interesting connections among ocean, land and atmospheric processes. Here we will focus on market basket data.

Now let’s study the basic concepts of association rules. Here is our data, which consists of 5 transactions made by our customers. Each transaction shows the products bought together in that transaction.

Transaction data set containing 5 transactions made by our customers
  • Binary representation of market basket data: Market basket data can be represented in a binary format as shown in the table, where each row corresponds to a transaction and each column corresponds to an item. An item can be treated as a binary variable whose value is 1 if the item is present in a transaction and 0 otherwise. This representation is a simplified view of real market data because it ignores much information, such as the quantity of items sold or the price paid for them.
A binary 0/1 representation of market basket data
  • Itemset: A collection of one or more items. For example: {Bread, Milk}
  • k-itemset: An itemset that contains k items. For example: {Bread, Milk} is a 2-itemset
  • Support count: An indication of how frequently the itemset appears in the database, i.e. its frequency of occurrence. For example: {Bread, Milk} occurs in 3 transactions of our data set
  • Support: The fraction of transactions that contain the itemset. Support = Frequency of itemset / Total number of transactions. For example: Support for {Bread, Milk} = 3/5 = 60%, which means that 60% of the transactions contain the itemset {Bread, Milk}
  • Confidence: For a rule X => Y, confidence shows how often Y is bought when X is bought. It is the number of transactions containing both X and Y divided by the number of transactions containing X: Confidence(X => Y) = P(X∩Y)/P(X) = Frequency(X, Y)/Frequency(X). For example: Confidence for Bread => Milk = 3/4 = 75%, which means that 75% of the transactions that contain X (Bread) also contain Y (Milk). If the confidence of a rule X => Y is 80%, then 80% of the transactions that contain X also contain Y
  • Form of an association rule: X => Y [Support, Confidence], where X and Y are sets of items in the transaction data. For example: Bread => Milk [Support = 60%, Confidence = 75%], where the support shows that in 60% of transactions bread and milk are purchased together, and the confidence shows that 75% of the customers who purchase bread also purchase milk
  • Thresholds: While creating association rules you can set minimum thresholds for support and confidence to filter out non-interesting rules and keep only frequent itemsets and strong rules
  • Frequent itemset: An itemset whose support is greater than or equal to a minimum support (min_support) threshold
  • Strong rules: If a rule X => Y [Support, Confidence] satisfies both min_support and min_confidence, it is a strong rule
  • The goal of association rule mining: To find all association rules with support ≥ min_support and confidence ≥ min_confidence
  • Lift: Lift measures the correlation between X and Y in the rule X => Y, i.e. how the itemset X affects the itemset Y. Lift(X => Y) = Confidence(X => Y) / Support(Y)
  • Lift for the rule {Bread} => {Milk}: Confidence of the rule (75%) / Support of Milk (4/5 = 80%) = 75% / 80% = 0.9375 (the sketch after this list reproduces these calculations in code)
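
To make these formulas concrete, here is a minimal Python sketch that computes support, confidence and lift for the Bread => Milk rule. The five transactions below are hypothetical baskets chosen only to be consistent with the counts quoted above ({Bread, Milk} in 3 of 5 transactions, Bread in 4, Milk in 4); in practice you would plug in your own transaction data.

```python
# Hypothetical transactions, chosen to match the counts used above:
# {Bread, Milk} appears in 3 of 5 baskets, Bread in 4, Milk in 4.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diapers", "Beer", "Eggs"},
    {"Milk", "Diapers", "Beer", "Cola"},
    {"Bread", "Milk", "Diapers", "Beer"},
    {"Bread", "Milk", "Diapers", "Cola"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    count = sum(1 for t in transactions if itemset <= t)
    return count / len(transactions)

def confidence(x, y, transactions):
    """Support of X and Y together divided by the support of X."""
    return support(x | y, transactions) / support(x, transactions)

def lift(x, y, transactions):
    """Confidence of the rule X => Y divided by the support of Y."""
    return confidence(x, y, transactions) / support(y, transactions)

bread, milk = {"Bread"}, {"Milk"}
print(support(bread | milk, transactions))    # 0.6    -> 60%
print(confidence(bread, milk, transactions))  # 0.75   -> 75%
print(lift(bread, milk, transactions))        # 0.9375
```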

Evaluate the rule using the value of the Lift:

  1. If the rule has a lift of 1, then X and Y are independent and no useful rule can be derived from them
  2. If the lift is < 1, then the presence of X has a negative effect on the presence of Y
  3. If the lift is > 1, then X and Y are positively dependent on each other, and the degree of dependence is given by the lift value (see the small helper below)
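
As a small illustration, the hypothetical helper below maps a lift value to these three cases and applies it to the Bread => Milk lift of 0.9375 computed earlier; the tolerance used to treat a value as "equal to 1" is an arbitrary assumption.

```python
def interpret_lift(lift_value, tol=1e-9):
    """Map a lift value to the three cases described above."""
    if abs(lift_value - 1.0) < tol:
        return "X and Y are independent; no useful rule"
    if lift_value < 1.0:
        return "X has a negative effect on the presence of Y"
    return "X and Y are positively associated"

print(interpret_lift(0.9375))  # Bread => Milk: slight negative association
```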

Why use support and confidence?

Support and Confidence measure how interesting the rule is. Support is an important measure because a rule that has very low support may occur simply by chance. A low support rule is also likely to be uninteresting from a business perspective because it may not be profitable to promote items that customers seldom buy together. For these reasons support is often used to eliminate uninteresting rules. Support is also used for efficient discovery of association rules.

Confidence, on the other hand, measures the reliability of the inference made by a rule. For a given rule X->Y, the higher the confidence, the more likely it is for Y to be present in transactions that contain X. Confidence also provides an estimate of the conditional probability of Y given X.

Association rules suggest a strong co-occurrence relationship between items and do not necessarily imply causality.

Basic Concepts of Association Rules

An intuitive example

Example of a movie data set

This data set shows the movies that each user likes. Our mission is to discover rules about people’s taste in movies, so we are looking for frequent pairs of movies. For example, without any calculation we can see that “People who watched Movie_1 also watched Movie_2”. The sketch below shows how such frequent pairs can be counted.
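
Before computing any measure, "looking for frequent pairs" simply means counting how often two movies are liked by the same user. The sketch below does this with a plain co-occurrence count; the user IDs and movie likes are made up for illustration and stand in for the data set shown in the figure.

```python
from collections import Counter
from itertools import combinations

# Hypothetical "user likes" data, for illustration only.
likes = {
    "user_1": {"Movie_1", "Movie_2", "Movie_3"},
    "user_2": {"Movie_1", "Movie_2"},
    "user_3": {"Movie_1", "Movie_2", "Movie_4"},
    "user_4": {"Movie_3", "Movie_4"},
}

pair_counts = Counter()
for movies in likes.values():
    # Count every unordered pair of movies liked by the same user.
    for pair in combinations(sorted(movies), 2):
        pair_counts[pair] += 1

# The most common pairs are the candidates for association rules.
print(pair_counts.most_common(3))
# [(('Movie_1', 'Movie_2'), 3), ...]
```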

Confidence

Green - people who have seen Interstellar; green with a red circle - people who have seen both Interstellar and Inception

We are testing the rule “Interstellar” (X) => “Inception” (Y). This rule means that “people who have seen Interstellar (X) also like to see Inception (Y)”.

  • Green are people who have seen Interstellar (X) = 40
  • Green with a red circle are people who have seen both Interstellar (X) and Inception (Y) = 7
  • Confidence = Frequency(X, Y) / Frequency(X)
  • Confidence for Interstellar (X) => Inception (Y) = 7/40 = 17.5%
  • Meaning: 17.5% of the people who have seen Interstellar also like Inception

Support

People in red circles have seen Inception, out of a total of 100 people

Let’s calculate the support for “Inception”:

  • We have 100 people here, and 10 of them have seen Inception (in red circles)
  • Support(X) = Frequency(X) / Total number of data points
  • Support for Inception = 10/100 = 10%
  • Meaning: if we randomly suggest that people watch “Inception”, the probability that they like it is 10%

Summary of an Example

  • As we already know, the support for Inception = 10%, which means that if we randomly suggest that people watch “Inception”, the probability that they like it is 10%
  • If we suggest Inception only to people who have seen “Interstellar”, the probability that they like it is 17.5% (the confidence of the rule Interstellar => Inception = 17.5%)
  • Lift is the improvement in the recommendation: Lift = Confidence of the rule / Support(Y)
  • Lift for the rule Interstellar => Inception = 17.5% / 10% = 1.75 (the sketch below reproduces these numbers)
  • If the lift is less than 1, having X in the cart reduces the probability of having Y in that cart. In other words, a lift value less than 1 shows that having X in the cart does not increase the chance of Y occurring in the cart, even if the rule shows a high confidence value
  • A lift value greater than 1 indicates a strong association between {X} and {Y}. The higher the lift, the greater the chance that {Y} is bought when the customer has already bought {X}. Lift is the measure that helps store managers decide on product placement
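
The whole movie example can be reproduced with a few lines of arithmetic. The counts below are the ones quoted above (100 people in total, 40 saw Interstellar, 7 saw both films, 10 saw Inception); only the variable names are invented for this sketch.

```python
total_people = 100       # size of the whole audience
saw_interstellar = 40    # people who have seen Interstellar (X)
saw_both = 7             # people who have seen both X and Y
saw_inception = 10       # people who have seen Inception (Y)

confidence = saw_both / saw_interstellar   # 7 / 40   = 0.175 -> 17.5%
support_y = saw_inception / total_people   # 10 / 100 = 0.10  -> 10%
lift = confidence / support_y              # 0.175 / 0.10 = 1.75

print(f"confidence = {confidence:.3f}, support(Y) = {support_y:.2f}, lift = {lift:.2f}")
```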

After a short break, continue reading to study association rule mining using the Apriori and Eclat algorithms. Click here.
