Advanced filtering with multiple metrics
Earlier, we used data from an online novelty gift store to find antecedents that could be used to promote a targeted consequent. Since the set of potential rules was large, we had to rely on the Apriori algorithm and multi-metric filtering to narrow it down. In this exercise, we'll examine the full set of rules and find a useful one, rather than targeting a particular antecedent.
Note that the data has been loaded, preprocessed, and one-hot encoded, and is available as onehot. Additionally, apriori() and association_rules() have been imported from mlxtend. In this exercise, you'll apply the Apriori algorithm to identify frequent itemsets. You'll then recover the set of association rules from the itemsets and apply multi-metric filtering.
This exercise is part of the course
Market Basket Analysis in Python
Exercise instructions
- Apply the Apriori algorithm to the one-hot encoded itemsets with a minimum support threshold of 0.001.
- Extract association rules using a minimum support threshold of 0.001.
- Set the antecedent support threshold at 0.002 and the consequent support threshold to 0.01.
- Set confidence to be higher than 0.60 and lift to be higher than 2.50.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Apply the Apriori algorithm with a minimum support threshold of 0.001
frequent_itemsets = ____(onehot, min_support = ____, use_colnames = True)
# Recover association rules using a minimum support threshold of 0.001
rules = ____(frequent_itemsets, metric = '____', min_threshold = 0.001)
# Apply a 0.002 antecedent support threshold, 0.60 confidence threshold, and 2.50 lift threshold
filtered_rules = rules[(rules['antecedent support'] > ____) &
(____['consequent support'] > 0.01) &
(rules['____'] > ____) &
(____ > 2.50)]
# Print the remaining rules
print(filtered_rules[['antecedents','consequents']])
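As a reference for the filtering step, here is a minimal runnable sketch using only pandas. The rules table below is hypothetical: the item names and metric values are made up, but the column names ('antecedent support', 'consequent support', 'confidence', 'lift') match those produced by mlxtend's association_rules().

```python
import pandas as pd

# Hypothetical rules table with the same column names that
# association_rules() produces (item names and values are invented)
rules = pd.DataFrame({
    'antecedents': [frozenset({'CANDLE'}), frozenset({'MUG'}), frozenset({'BAG'})],
    'consequents': [frozenset({'HOLDER'}), frozenset({'COASTER'}), frozenset({'PURSE'})],
    'antecedent support': [0.005, 0.001, 0.004],
    'consequent support': [0.020, 0.050, 0.002],
    'confidence': [0.75, 0.80, 0.65],
    'lift': [3.1, 1.2, 4.0],
})

# Multi-metric filtering: each comparison yields a boolean Series,
# and & combines them element-wise, so a rule survives only if it
# clears every threshold at once
filtered_rules = rules[(rules['antecedent support'] > 0.002) &
                       (rules['consequent support'] > 0.01) &
                       (rules['confidence'] > 0.60) &
                       (rules['lift'] > 2.50)]

print(filtered_rules[['antecedents', 'consequents']])
```

With these made-up numbers, only the first rule clears all four thresholds: the second fails on antecedent support and the third on consequent support.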