Advanced filtering with multiple metrics
Earlier, we used data from an online novelty gift store to find antecedents that could be used to promote a targeted consequent. Since the set of potential rules was large, we had to rely on the Apriori algorithm and multi-metric filtering to narrow it down. In this exercise, we'll examine the full set of rules and find a useful one, rather than targeting a particular antecedent.
Note that the data has been loaded, preprocessed, and one-hot encoded, and is available as onehot. Additionally, apriori() and association_rules() have been imported from mlxtend. In this exercise, you'll apply the Apriori algorithm to identify frequent itemsets. You'll then recover the set of association rules from the itemsets and apply multi-metric filtering.
This exercise is part of the course Market Basket Analysis in Python.
Exercise instructions
- Apply the Apriori algorithm to the one-hot encoded data with a minimum support threshold of 0.001.
- Extract association rules using a minimum support threshold of 0.001.
- Set the antecedent support threshold to 0.002 and the consequent support threshold to 0.01.
- Set confidence to be higher than 0.60 and lift to be higher than 2.50.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Apply the Apriori algorithm with a minimum support threshold of 0.001
frequent_itemsets = ____(onehot, min_support = ____, use_colnames = True)
# Recover association rules using a minimum support threshold of 0.001
rules = ____(frequent_itemsets, metric = '____', min_threshold = 0.001)
# Apply a 0.002 antecedent support threshold, 0.60 confidence threshold, and 2.50 lift threshold
filtered_rules = rules[(rules['antecedent support'] > ____) &
(____['consequent support'] > 0.01) &
(rules['____'] > ____) &
(____ > 2.50)]
# Print remaining rule
print(filtered_rules[['antecedents','consequents']])