Dan Isaza
Weekly Data Science
Jul 12, 2018

Thanks for reading, Sarah!

Choosing a support threshold depends on context, and several things may factor into the decision. Typically, the threshold is chosen so that the resulting number of frequent itemsets is both meaningful and manageable, whatever that may mean in your particular context.

It’s common to express the support threshold as a percentage of the number of baskets, and typically a threshold of less than 1% is chosen.
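To make the percentage concrete, here is a minimal sketch of turning a percentage threshold into the number of baskets an itemset has to appear in. The function name `min_support_count` and the basket counts are made up for illustration; they aren't from any particular library.

```python
from fractions import Fraction
import math


def min_support_count(num_baskets, support_percent):
    """Smallest number of baskets an itemset must appear in to count as frequent.

    support_percent is a percentage of all baskets (1 means 1%).
    Fraction keeps the arithmetic exact, so 1% of 250,000 is exactly 2,500.
    """
    fraction = Fraction(str(support_percent)) / 100
    return math.ceil(fraction * num_baskets)


# Hypothetical store with 250,000 baskets.
print(min_support_count(250_000, 1))    # 2500
print(min_support_count(250_000, 0.5))  # 1250
```

Rounding up is a choice on my part: it means an itemset never counts as frequent with less than the stated percentage of baskets.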

That said, you may have to tune the threshold to make sure you have a manageable volume of results for your application. Let’s say, for example, that we want to rearrange the items on the shelves of a brick-and-mortar store to increase sales. Since moving physical inventory is time-consuming and expensive, we’ll likely limit the number of association rules we seek to only a few dozen. In contrast, an e-commerce website may be able to reasonably use insights from thousands of association rules, since the cost of “rearranging shelves” is extremely low: it might entail showing different products right before the checkout step, for example.

Depending on which context we find ourselves in, we can tune the threshold to ensure that the resulting number of frequent itemsets is reasonable for our use case.
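One way to picture that tuning is a rough sketch like the one below. It counts itemsets by brute force (fine for a toy example, not for real retail data, where you'd use something like A-Priori), then raises the support count until the number of frequent itemsets fits a budget. The names `itemset_supports` and `tune_threshold`, the budget, and the baskets are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def itemset_supports(baskets, max_size=2):
    """Count how many baskets contain each itemset of up to max_size items."""
    counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # dedupe so each basket counts once per itemset
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    return counts


def tune_threshold(baskets, max_itemsets, max_size=2):
    """Return the lowest support count whose frequent itemsets fit the budget."""
    counts = itemset_supports(baskets, max_size)
    for threshold in range(1, len(baskets) + 1):
        frequent = {s: c for s, c in counts.items() if c >= threshold}
        if len(frequent) <= max_itemsets:
            return threshold, frequent
    # No itemset can appear in more baskets than there are baskets.
    return len(baskets) + 1, {}


# Made-up baskets for illustration.
baskets = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
    ["bread", "milk", "beer"],
]

# A "physical store" budget: we only want to act on a handful of itemsets.
threshold, frequent = tune_threshold(baskets, max_itemsets=5)
print(threshold, sorted(frequent))
```

With a large budget (the e-commerce case) the same function settles on a much lower threshold and returns many more itemsets.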

(Also, for more insight into what makes association rules meaningful, check out my post, “Finding Meaningful Associations in Retail Data.”)

Stanford Math & CS | VP of Engineering at Clever Real Estate | (he/him pronouns)