“If this then that” with the apriori

1. “If this then that” with the apriori

As we have seen, the apriori algorithm extracts a set of rules from the transactional dataset. Let's make sure we understand the output.

2. Recap of extracted rules (1)

Let's go back to the transactions from the grocery store and apply the apriori function to them with the given parameters.
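A minimal sketch of this step, assuming the "arules" package is installed. The slide's data and parameters are not reproduced in this transcript, so the baskets and the `supp`/`conf` thresholds below are hypothetical stand-ins.

```r
library(arules)

# Hypothetical toy baskets (not the course's grocery data)
baskets <- list(
  c("Bread", "Butter"),
  c("Bread", "Butter", "Cheese"),
  c("Bread", "Butter", "Cheese", "Wine"),
  c("Butter", "Cheese", "Wine"),
  c("Butter", "Cheese"),
  c("Butter", "Cheese", "Wine"),
  c("Cheese", "Wine")
)
trans <- as(baskets, "transactions")

# Assumed thresholds; minlen = 2 skips rules with an empty left-hand side
rules <- apriori(trans,
                 parameter = list(supp = 0.4, conf = 0.9, minlen = 2),
                 control = list(verbose = FALSE))

# Print each rule with its quality measures
inspect(rules)
```

Raising `supp` or `conf` shrinks the rule set, and lowering them grows it, so these two values are usually the first thing to tune.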

3. Recap of extracted rules (2)

The set of extracted rules can be coerced to a data frame and then filtered or processed like any other data frame. Printing the data frame displays all rules and their corresponding metrics: support, confidence, lift and count. Let's have a closer look at the first rule: it means that if you buy "Bread", you are likely to buy "Butter" as well. Recall that the support indicates that 42% of transactions contain both Bread and Butter. The confidence of 1 means that 100% of customers who bought Bread also bought Butter. Finally, a lift of 1.16 means that a customer known to have bought Bread is 16% more likely to buy Butter than a customer about whom we know nothing.
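As a rough illustration of the coercion and of how lift relates to confidence, the sketch below uses hypothetical toy baskets, not the course's data. The thresholds and the variable names (`rules_df`, `bread_butter`, `lift_by_hand`) are assumptions made for the example.

```r
library(arules)

# Hypothetical toy baskets (not the course's grocery data)
baskets <- list(
  c("Bread", "Butter"),
  c("Bread", "Butter", "Cheese"),
  c("Bread", "Butter", "Cheese", "Wine"),
  c("Butter", "Cheese", "Wine"),
  c("Butter", "Cheese"),
  c("Butter", "Cheese", "Wine"),
  c("Cheese", "Wine")
)
trans <- as(baskets, "transactions")
rules <- apriori(trans,
                 parameter = list(supp = 0.4, conf = 0.9, minlen = 2),
                 control = list(verbose = FALSE))

# Coerce the rules to a data frame and filter it like any other data frame
rules_df <- as(rules, "data.frame")
strong_rules <- rules_df[rules_df$lift > 1.1, ]
print(strong_rules)

# Recompute the lift of {Bread} => {Butter} by hand:
# lift = confidence / support of the right-hand side
bread_butter <- subset(rules, lhs %in% "Bread" & rhs %in% "Butter")
supp_butter  <- itemFrequency(trans)[["Butter"]]
lift_by_hand <- quality(bread_butter)$confidence / supp_butter
print(lift_by_hand)
```

A lift above 1 means the left-hand side makes the right-hand side more likely than its baseline frequency would suggest.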

4. Appearance of frequent itemsets

Remember that the apriori function enables us to retrieve both frequent itemsets and a set of extracted rules. Let's tailor the appearance of the R output, starting with the frequent itemsets. The "appearance" argument of the apriori function can be used to restrict the output to specific items. For instance, you can select only the frequent itemsets made up of the items "Cheese" and "Wine" by using the "items" keyword you see in the appearance list. When inspecting the frequent itemsets, we retrieve all itemsets containing Cheese, Wine or both.
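A hedged sketch of this restriction, again on hypothetical toy baskets with an assumed support threshold. Setting `default = "none"` explicitly is a choice made here so that items not listed are excluded regardless of how the installed "arules" version guesses the default.

```r
library(arules)

# Hypothetical toy baskets (not the course's grocery data)
baskets <- list(
  c("Bread", "Butter"),
  c("Bread", "Butter", "Cheese"),
  c("Bread", "Butter", "Cheese", "Wine"),
  c("Butter", "Cheese", "Wine"),
  c("Butter", "Cheese"),
  c("Butter", "Cheese", "Wine"),
  c("Cheese", "Wine")
)
trans <- as(baskets, "transactions")

# Mine frequent itemsets (not rules) built only from Cheese and Wine
itemsets <- apriori(trans,
                    parameter  = list(supp = 0.4, target = "frequent itemsets"),
                    appearance = list(items = c("Cheese", "Wine"),
                                      default = "none"),
                    control    = list(verbose = FALSE))
inspect(itemsets)
```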

5. Appearance of extracted rules

Likewise, we can change the appearance of extracted rules. For instance, let's retrieve the rules for which the item "Cheese" is the consequent. In the apriori function, we set the right-hand side to "Cheese" using the "rhs" keyword. By inspecting the rules, we obtain the set of extracted rules satisfying the list of parameters set in the apriori function. Both Wine and Butter imply Cheese. Does that surprise you?
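One way this could look in code, using hypothetical toy baskets and assumed thresholds. Here `default = "lhs"` lets every other item appear on the left-hand side only.

```r
library(arules)

# Hypothetical toy baskets (not the course's grocery data)
baskets <- list(
  c("Bread", "Butter"),
  c("Bread", "Butter", "Cheese"),
  c("Bread", "Butter", "Cheese", "Wine"),
  c("Butter", "Cheese", "Wine"),
  c("Butter", "Cheese"),
  c("Butter", "Cheese", "Wine"),
  c("Cheese", "Wine")
)
trans <- as(baskets, "transactions")

# Keep only rules whose consequent (right-hand side) is Cheese
cheese_rules <- apriori(trans,
                        parameter  = list(supp = 0.4, conf = 0.9, minlen = 2),
                        appearance = list(rhs = "Cheese", default = "lhs"),
                        control    = list(verbose = FALSE))
inspect(cheese_rules)
```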

6. Redundant rules

There are often too many association rules inferred from a transactional dataset. Some rules may be redundant in the sense that they provide no extra knowledge beyond what other extracted rules already tell us. But what exactly is a redundant rule? A rule is redundant if a more general rule with the same or a higher confidence exists. A more general rule is called a super-rule: it has the same RHS but one or more items removed from the LHS. For example, the rule A implies C is a super-rule of the two rules shown here, since each of them has the same RHS and additional items in its LHS. Finally, the set of non-redundant rules is defined as all rules that are not considered redundant.
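Stated a little more formally, with conf denoting confidence and X' a proper subset of the LHS X (notation chosen here for illustration), the definition above reads:

```latex
X \Rightarrow Y \ \text{is redundant}
\iff
\exists\, X' \subset X \ \text{such that}\
\operatorname{conf}(X' \Rightarrow Y) \ge \operatorname{conf}(X \Rightarrow Y)
```

For instance, if A implies C has confidence 0.9 and the rule A, B implies C has confidence 0.9 or less, the longer rule is redundant, because adding B to the LHS does not make C any more likely.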

7. Rule redundancy (1)

Let's see how to generate the set of non-redundant rules in R for some given parameters. The function "is.redundant" from the "arules" package identifies the redundant rules. By filtering out these rules, we obtain the set of non-redundant rules.
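A minimal sketch of that filtering, assuming hypothetical toy baskets and thresholds. `is.redundant` returns one logical value per rule, so negating it selects the rules to keep.

```r
library(arules)

# Hypothetical toy baskets (not the course's grocery data)
baskets <- list(
  c("Bread", "Butter"),
  c("Bread", "Butter", "Cheese"),
  c("Bread", "Butter", "Cheese", "Wine"),
  c("Butter", "Cheese", "Wine"),
  c("Butter", "Cheese"),
  c("Butter", "Cheese", "Wine"),
  c("Cheese", "Wine")
)
trans <- as(baskets, "transactions")
rules <- apriori(trans,
                 parameter = list(supp = 0.4, conf = 0.9, minlen = 2),
                 control = list(verbose = FALSE))

# TRUE for every rule that has a more general rule in the set
# with the same or a higher confidence
redundant <- is.redundant(rules)

# Split the rule set into its redundant and non-redundant parts
redundant_rules     <- rules[redundant]
non_redundant_rules <- rules[!redundant]

inspect(redundant_rules)
inspect(non_redundant_rules)
```

In this toy data, {Butter, Wine} => {Cheese} is flagged because {Wine} => {Cheese} already reaches the same confidence with a shorter LHS.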

8. Rule redundancy (2)

Let's compare both sets of extracted rules, the original one and the non-redundant one. In this example, rules #2 and #3 are considered redundant, as they provide no extra knowledge beyond rule #1. "Butter implies Bread" is considered a non-redundant rule, whereas rules #2 and #3 are dropped: rule #1 is a super-rule of both, and both have the same confidence as rule #1.

9. Let's follow the rules!

The next step will be to visualize the set of extracted rules. But first, it is your turn to work with rules! Make sure to follow the rules with the Online retail dataset.
