Advanced rules

1. Advanced rules

In the previous exercise, we went beyond the one-off computation of metrics to applying Zhang's metric to a list of itemsets. That was a preview of where the rest of the course is headed, starting with this lesson.

2. Overview of market basket analysis

Going forward, we will usually apply a three-step procedure whenever we do market basket analysis. We'll start off by generating a large set of rules. We'll then filter or prune those rules using metrics. Finally, we'll often be left with more than one rule, forcing us to use some intuition and common sense before making a recommendation.

3. Generating rules

In chapter 1, we discussed how to generate rules. We saw that the number of rules grew exponentially in the number of items, but most of these rules were not useful. We'll typically deal with this issue by applying an initial round of filtering, often using support. We'll demonstrate how to do that in the following chapter using the Apriori algorithm. In this video and the exercises that follow, we'll be given a set of rules to investigate. We'll then filter those rules using multiple metrics.
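
To get a feel for how quickly the number of rules grows, here is a minimal sketch that counts them. It assumes a rule is any pair of non-empty, non-overlapping itemsets ("if A then B"); the function name `count_rules` is ours and is only for illustration.

```python
from math import comb

def count_rules(n_items: int) -> int:
    """Count rules 'if A then B' where A and B are non-empty, disjoint itemsets."""
    total = 0
    for antecedent_size in range(1, n_items):
        remaining = n_items - antecedent_size
        # Choose the antecedent, then any non-empty subset of the
        # remaining items as the consequent.
        total += comb(n_items, antecedent_size) * (2 ** remaining - 1)
    return total

for n in (2, 3, 4, 10):
    print(n, count_rules(n))
```

Under this definition the count matches the closed form 3^n - 2^(n+1) + 1, which is why even a small catalog needs an initial round of pruning.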

4. How does filtering work?

Filtering, which we'll also refer to as pruning, works by removing rules that perform poorly according to some metric. We can see how this works through the table on this slide. Notice that it is similar to the rules DataFrame from the previous exercise, but contains multiple metrics. According to the table, the rule "if Harry Potter then The Hunger Games" has a relatively low support and might be excluded. Similarly, the rule "if The Hobbit then Twilight" has a lift below 1, which suggests that we may want to discard it.
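
As a rough sketch of what pruning on a single metric looks like in pandas, consider the small DataFrame below. The book titles come from the lesson, but every number is made up for illustration, and the support threshold of 0.05 is an assumption rather than a value from the course.

```python
import pandas as pd

# Hypothetical values; they are NOT the figures from the lesson's table.
rules = pd.DataFrame({
    "antecedents": ["Harry Potter", "The Hobbit", "Twilight"],
    "consequents": ["The Hunger Games", "Twilight", "The Hunger Games"],
    "support": [0.01, 0.08, 0.12],
    "lift": [1.30, 0.85, 1.60],
})

MIN_SUPPORT = 0.05  # assumed threshold

# Prune on support alone: drops "if Harry Potter then The Hunger Games".
kept_by_support = rules[rules["support"] >= MIN_SUPPORT]

# Prune on lift alone: drops "if The Hobbit then Twilight" (lift below 1).
kept_by_lift = rules[rules["lift"] >= 1.0]

print(kept_by_support)
print(kept_by_lift)
```

Each filter is just a boolean mask over one column, so the original `rules` DataFrame is left untouched and we can compare the results of different thresholds side by side.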

5. Multi-metric filtering

Multi-metric filtering is simply the application of filtering rules that depend on multiple metrics. For instance, we might require a rule to have a support of at least 0.02 and a Zhang's metric of at least 0.05. Alternatively, we might require a rule to have either a high confidence level or a high conviction level.
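
Both kinds of condition can be sketched with boolean masks in pandas. The data below is invented, the column name `zhang` is an assumption about how the metric is stored, and the cutoffs used for "high" confidence (0.8) and "high" conviction (2.0) are placeholders we chose for the example.

```python
import pandas as pd

# Hypothetical rules and metric values, for illustration only.
rules = pd.DataFrame({
    "antecedents": ["Harry Potter", "The Hobbit", "Twilight", "The Hunger Games"],
    "consequents": ["The Hunger Games", "Twilight", "The Hobbit", "Harry Potter"],
    "support": [0.030, 0.015, 0.040, 0.025],
    "confidence": [0.90, 0.40, 0.30, 0.35],
    "conviction": [1.10, 1.05, 2.50, 1.02],
    "zhang": [0.10, 0.30, 0.02, 0.06],
})

# Both conditions must hold: combine the masks with & (note the parentheses).
both = rules[(rules["support"] >= 0.02) & (rules["zhang"] >= 0.05)]

# Either condition may hold: combine the masks with |.
either = rules[(rules["confidence"] >= 0.8) | (rules["conviction"] >= 2.0)]

print(both)
print(either)
```

The parentheses around each comparison matter, because `&` and `|` bind more tightly than `>=` in Python.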

6. Performing multi-metric filtering

Now, how do we apply multi-metric filtering? Let's step back to our ebook start-up example, where the founder has approached us for advice. In the previous exercise, she asked us to make use of a list of itemsets. She has now given us a DataFrame called "rules," which contains the work of a data scientist who was previously on staff. After applying the head method and printing the result, we notice that the DataFrame looks similar to the table on the previous slide. That is, it contains columns for antecedents and consequents, along with several columns of metrics.

7. Performing multi-metric filtering

Now, let's say the founder has asked us to find useful rules to promote books or sets of books that are infrequently sold. What would we do? Well, the first step would be to select only the rules whose consequent itemsets have low support. Doing this will reduce the number of rules from 149 to 12. We might then filter further, retaining only the rules that have a lift greater than 1.5. This brings us to just two rules, both of which involve promoting the same books, but using different antecedents. And that's it: we've performed multi-metric filtering to identify a narrow subset of useful rules.
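
The two-step filter described here might look like the sketch below. It does not reproduce the founder's 149 rules: the five rows are invented, the column name "consequent support" is an assumption about how the DataFrame is laid out, and the cutoff of 0.05 for "infrequently sold" is ours. Only the lift threshold of 1.5 comes from the lesson.

```python
import pandas as pd

# Hypothetical stand-in for the founder's rules DataFrame.
rules = pd.DataFrame({
    "antecedents": ["Harry Potter", "The Hobbit", "Twilight",
                    "The Hunger Games", "Harry Potter"],
    "consequents": ["Twilight", "The Hunger Games", "The Hobbit",
                    "Harry Potter", "The Hobbit"],
    "consequent support": [0.03, 0.20, 0.04, 0.25, 0.04],
    "lift": [1.2, 2.1, 1.8, 1.7, 1.6],
})
print(rules.head())

MAX_CONSEQUENT_SUPPORT = 0.05  # assumed cutoff for "infrequently sold"
MIN_LIFT = 1.5                 # threshold mentioned in the lesson

# Step 1: keep rules whose consequent is rarely purchased.
infrequent = rules[rules["consequent support"] < MAX_CONSEQUENT_SUPPORT]

# Step 2: of those, keep rules with a lift greater than 1.5.
useful = infrequent[infrequent["lift"] > MIN_LIFT]

print(len(rules), len(infrequent), len(useful))
print(useful)
```

In this toy data the surviving rules share the same consequent but have different antecedents, which mirrors the outcome described above. Filtering in two named steps also makes it easy to report how many rules each criterion removes.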

8. Let's practice!

We now know how to perform multi-metric filtering. Let's apply it in some exercises.

Market Basket Analysis in Python