
Apriori Algorithm and Association Rules

Explore the Apriori algorithm and association rule mining to uncover meaningful relationships within datasets. Understand key metrics such as support, confidence, and lift that evaluate these rules, and see how these concepts apply in market basket analysis and other domains. This lesson helps you grasp foundational unsupervised learning techniques for pattern discovery.

Association rule mining

Association rule mining finds rules and relationships among the items in a dataset. It works on both relational and transactional databases, and it can also be used to find features that are correlated with each other. An association rule has two parts:

  • An antecedent is an item or itemset found in the dataset at hand.
  • A consequent is an item or itemset found in combination with the antecedent.

One example of an association rule is:


$$\{Antecedent\} \rightarrow \{Consequent\}$$
$$\{Bread\} \rightarrow \{Butter\}$$
$$X \rightarrow Y$$


$X$ and $Y$ are called the antecedent and the consequent, respectively. The rule can be read as: people who buy bread are also likely to buy butter. Here, “bread” and “butter” are the items. The rule has been deduced from a dataset, and companies can use rules like it to make better-informed decisions and increase revenue.
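To make the two parts of a rule concrete, here is a minimal sketch of one way to represent a rule in Python. The `AssociationRule` class and its field names are hypothetical and chosen only for illustration. They are not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssociationRule:
    """Hypothetical container for a rule of the form X -> Y."""
    antecedent: frozenset  # X: the itemset on the left-hand side
    consequent: frozenset  # Y: the itemset on the right-hand side

    def __str__(self):
        left = ", ".join(sorted(self.antecedent))
        right = ", ".join(sorted(self.consequent))
        return f"{{{left}}} -> {{{right}}}"


# The bread-and-butter rule from the text.
rule = AssociationRule(frozenset({"Bread"}), frozenset({"Butter"}))
print(rule)  # {Bread} -> {Butter}
```

Both sides are stored as sets because an antecedent or consequent may contain more than one item.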

Metrics for evaluating association rules

Several metrics are used to evaluate how interesting an association rule is. Each of them is computed from the dataset the rule was derived from. Let’s consider the following transactional table:

Transaction ID | Items
1 | Bread, Milk
2 | Bread, Diaper, Beer, Eggs
3 | Milk, Diaper, Beer, Coke
4 | Bread, Milk, Diaper, Beer
5 | Bread, Milk, Diaper, Coke

Support

Support indicates how frequent or popular an itemset is, measured by the proportion of transactions in which it appears. It is a value between 0 and 1, and a value closer to 1 means the itemset occurs in a larger share of the transactions. We refer to an itemset as a frequent itemset if its support exceeds a specified minimum-support threshold. In the above table, we have:

$$Support\{Beer\} = \frac{3}{5}$$

There are five transactions in total, and beer appears in three of them.

$$Support\{Milk, Coke\} = \frac{2}{5}$$ ...
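The support values above can be checked with a short Python sketch. The helper names `support` and `is_frequent` are hypothetical, transaction 2 is read as containing “Beer, Eggs”, and the threshold comparison uses `>=`, which is an assumed convention. Switch to `>` if the threshold must be strictly exceeded.

```python
from fractions import Fraction

# Transactions from the table above (transaction 2 read as "Beer, Eggs").
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support(itemset, transactions):
    """Proportion of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)  # subset test
    return Fraction(hits, len(transactions))


def is_frequent(itemset, transactions, min_support):
    """Assumed convention: frequent if support >= min_support."""
    return support(itemset, transactions) >= min_support


print(support({"Beer"}, transactions))          # 3/5
print(support({"Milk", "Coke"}, transactions))  # 2/5
```

With an illustrative threshold of 1/2, `is_frequent({"Beer"}, transactions, Fraction(1, 2))` returns `True`, while `{"Milk", "Coke"}` falls below the threshold and is not a frequent itemset.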