Apriori Algorithm

Sanskar wagavkar
Published in Analytics Vidhya
4 min read · Dec 19, 2020

The Apriori algorithm is a machine learning algorithm used to mine frequent item-sets and to generate association rules from a transaction data-set.

It is mostly used for recommendations.

When you visit a supermarket (Big Bazaar, DMart, a Reliance store, etc.), you will see many offers such as buy one, get one free or buy two, get one free. For example, if you buy one sanitizer you get one soap free. This is where the Apriori algorithm comes in: with its help we can recommend a product that is highly associated with another product.

Association Rules

Association rules are a technique for checking how strongly items are associated with each other.

When you buy product A, what is the chance that you will also buy product B?

For example, if you buy milk from the store, what is the chance that you will also buy paneer or butter?

To check the association between them, we use association rules.

Association rules involve three concepts:

1. Support

2. Confidence

3. Lift

Support: Support tells us how popular an item-set is, that is, the fraction of all transactions that contain it.

To calculate support, we use the formula

Support(A) = (number of transactions in which A appears) / (total number of transactions)
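
As a minimal sketch of the support formula, assuming each transaction is stored as a Python set: the baskets and the `support` helper below are invented for illustration and are not part of any library.

```python
# Hypothetical baskets, invented for illustration only.
transactions = [
    {"milk", "butter"},
    {"milk", "paneer"},
    {"milk", "butter", "paneer"},
    {"bread"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

print(support({"milk"}, transactions))            # 0.75 (3 of 4 baskets)
print(support({"milk", "butter"}, transactions))  # 0.5  (2 of 4 baskets)
```

Note that the support of an item-set can never be higher than the support of any of its subsets, which is the property Apriori relies on when it discards candidates.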

Confidence: Confidence tells us how likely item B is to be purchased when item A is purchased, expressed as a percentage.

To calculate confidence, we use the formula

Confidence(A->B) = Support(A∪B) / Support(A)

Here Support(A∪B) is the support of the transactions that contain both A and B.
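
The confidence formula can be sketched the same way. The baskets and the helper names `support` and `confidence` are assumptions made for this example only.

```python
# Hypothetical baskets, invented for illustration only.
transactions = [
    {"milk", "butter"},
    {"milk", "paneer"},
    {"milk", "butter", "paneer"},
    {"bread"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Support of (antecedent and consequent together) divided by support of antecedent.

    Raises ZeroDivisionError if the antecedent never appears.
    """
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

# Of the 3 baskets with milk, 2 also contain butter, so this is about 0.67.
print(confidence({"milk"}, {"butter"}, transactions))
```

Confidence is not symmetric: with these baskets, butter -> milk has a confidence of 1.0 because every basket with butter also contains milk.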

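Lift: Lift is the ratio between the confidence of a rule and its expected confidence. A lift greater than 1 means A and B are bought together more often than we would expect if they were independent.

To calculate lift, we use the formula

Lift(A->B) = Support(A∪B) / (Support(A) * Support(B))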
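
A short sketch of lift, computed as the support of A and B together divided by the product of their individual supports. The baskets and the `support` and `lift` helpers are hypothetical.

```python
# Hypothetical baskets, invented for illustration only.
transactions = [
    {"milk", "butter"},
    {"milk", "paneer"},
    {"milk", "butter", "paneer"},
    {"bread"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def lift(a, b, transactions):
    """Support of A and B together over the product of their separate supports."""
    together = support(set(a) | set(b), transactions)
    return together / (support(a, transactions) * support(b, transactions))

# 0.5 / (0.75 * 0.5), roughly 1.33: milk and butter co-occur more than chance.
print(lift({"milk"}, {"butter"}, transactions))
```

Unlike confidence, lift is symmetric: swapping A and B gives the same value.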

Working of Apriori Algorithm

Consider the market basket transaction data-set below. Find which items have a strong association with each other.

STEP 1: -

Here,

Minimum support = 50%

Minimum confidence = 60%

The data-set has 6 transactions, so the minimum support count = 50% of 6 = 0.5 * 6 = 3
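
The threshold arithmetic in this step can be checked in a few lines. The variable names are illustrative, and rounding up with `math.ceil` is an assumption for cases where the product is not a whole number.

```python
import math

n_transactions = 6     # transactions in the example data-set
min_support = 0.5      # 50% minimum support
min_confidence = 0.6   # 60% minimum confidence, used later for the rules

# An item-set must appear in at least this many transactions to be frequent.
# Rounding up is an assumption for non-integer products.
min_support_count = math.ceil(min_support * n_transactions)
print(min_support_count)  # 3
```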

STEP 2: -

The minimum support count is now 3. We count the number of transactions in which each item appears.

STEP 3: -

Here,

We have a minimum support count of 3.

In the table above, if an item's count is less than 3, we remove that item.
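
The counting and pruning of single items can be sketched as follows. The six transactions are hypothetical: they were chosen so that their counts agree with the figures quoted in this walkthrough, and the item l5 is an assumption added to show an item being removed.

```python
from collections import Counter

# Hypothetical data-set, constructed to match the counts quoted in the text.
transactions = [
    {"l1", "l2", "l3"},
    {"l1", "l2", "l3"},
    {"l1", "l2", "l3", "l4"},
    {"l1", "l2", "l4"},
    {"l1", "l3", "l5"},
    {"l2", "l3", "l4"},
]
min_support_count = 3

# Count how many transactions each single item appears in.
item_counts = Counter(item for t in transactions for item in t)

# Keep only the items that reach the minimum support count.
frequent_items = {i: n for i, n in item_counts.items() if n >= min_support_count}

print(sorted(frequent_items.items()))
# [('l1', 5), ('l2', 5), ('l3', 5), ('l4', 3)]  (l5 appears once and is removed)
```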

STEP 4: -

Now we repeat the same steps for item-sets of 2 items.

STEP 5: -

Here,

The minimum support count is 3. We now remove the item-sets whose count is less than 3. Here (l1, l4) and (l3, l4) each have a count of 2, so we remove them.

From the table above we can say that the item-set l1, l2, l3 is frequent.
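
Putting the steps together, the level-wise search can be sketched as below. This is a minimal illustration under stated assumptions, not a tuned implementation: the `apriori` function is written for this example, and the transactions are a hypothetical data-set constructed to match the counts quoted above.

```python
from itertools import combinations

# Hypothetical data-set, constructed to match the counts quoted in the text.
transactions = [
    {"l1", "l2", "l3"},
    {"l1", "l2", "l3"},
    {"l1", "l2", "l3", "l4"},
    {"l1", "l2", "l4"},
    {"l1", "l3", "l5"},
    {"l2", "l3", "l4"},
]

def apriori(transactions, min_count):
    """Return {frozenset(item-set): count} for every frequent item-set."""
    def count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    # Level 1: count single items and drop the infrequent ones.
    singletons = {frozenset([i]) for t in transactions for i in t}
    counted = {c: count(c) for c in singletons}
    current = {c: n for c, n in counted.items() if n >= min_count}

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        # Join: merge frequent (k-1)-item-sets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        counted = {c: count(c) for c in candidates}
        current = {c: n for c, n in counted.items() if n >= min_count}
    return frequent

frequent = apriori(transactions, min_count=3)
for itemset, n in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

With this data the pairs (l1, l4) and (l3, l4) are dropped with a count of 2, and the only frequent 3-item-set is l1, l2, l3 with a count of 3.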

STEP 6: -

Now we are ready to apply the association rules.

Minimum confidence = 60%

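1. l1, l2 => l3

Confidence = count(l1, l2, l3) / count(l1, l2) * 100 = 3/4 * 100

= 75%

2. l1, l3 => l2

Confidence = count(l1, l2, l3) / count(l1, l3) * 100 = 3/4 * 100

= 75%

3. l2, l3 => l1

Confidence = count(l1, l2, l3) / count(l2, l3) * 100 = 3/4 * 100

= 75%

Each rule has a confidence of 75%, which is above the minimum confidence of 60%, so we can say that l1, l2, l3 have strong association rules.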
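
The three confidence calculations can be reproduced with a short sketch. The transactions are a hypothetical data-set constructed to match the counts used above, and the `count` helper is illustrative. Only rules with a two-item left-hand side are generated, mirroring the three rules listed; a fuller rule generator would also try single-item left-hand sides.

```python
from itertools import combinations

# Hypothetical data-set, constructed to match the counts quoted in the text.
transactions = [
    {"l1", "l2", "l3"},
    {"l1", "l2", "l3"},
    {"l1", "l2", "l3", "l4"},
    {"l1", "l2", "l4"},
    {"l1", "l3", "l5"},
    {"l2", "l3", "l4"},
]
min_confidence = 0.6

def count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)

itemset = frozenset({"l1", "l2", "l3"})
rules = []
for antecedent in combinations(sorted(itemset), 2):
    consequent = itemset - set(antecedent)
    conf = count(itemset) / count(antecedent)
    if conf >= min_confidence:
        rules.append((antecedent, tuple(consequent), conf))

for antecedent, consequent, conf in rules:
    print(f"{', '.join(antecedent)} => {', '.join(consequent)}: {conf:.0%}")
# l1, l2 => l3: 75%
# l1, l3 => l2: 75%
# l2, l3 => l1: 75%
```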

Application

1. We can use this in supermarkets for recommendations.

2. We can use this in e-commerce.

3. We can use this in the software industry.

4. We can use this for marketing purposes as well.

Advantages and Disadvantages

Advantages

1. Easy to understand algorithm

2. Easy to implement on large item-sets in large databases

Disadvantages

1. The entire database needs to be scanned on every pass

2. It requires high computational power

Conclusion

With the help of the Apriori algorithm and association rules, we can see how products are associated with each other and recommend products to customers.
