Applying Zhang's rule
In Chapter 2, we learned that Zhang's rule is a continuous measure of association between two items that takes values in the [-1,+1] interval. A -1 value indicates a perfectly negative association and a +1 value indicates a perfectly positive association. In this exercise, you'll determine whether Zhang's rule can be used to refine a set of rules a gift store is currently using to promote products.
Note that the frequent itemsets have been computed for you and are available as frequent_itemsets. Additionally, zhangs_rule() has been defined and association_rules() has been imported from mlxtend. You will start by re-computing the original set of rules. After that, you will apply Zhang's metric to select only those rules with a high and positive association.
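The body of zhangs_rule() is not shown on this page. As a rough sketch of what such a helper might compute, the block below implements the commonly cited form of Zhang's metric from the support columns that association_rules() produces. The function name zhang_metric and the toy support values are assumptions made for illustration, not the course's own code or data.

```python
import numpy as np
import pandas as pd


def zhang_metric(rules: pd.DataFrame) -> pd.Series:
    """Hypothetical stand-in for the course's zhangs_rule() helper.

    Expects the columns 'antecedent support', 'consequent support'
    and 'support' (joint support of antecedent and consequent).
    """
    support_a = rules["antecedent support"]
    support_c = rules["consequent support"]
    support_ac = rules["support"]

    # Numerator: how far the joint support departs from independence.
    numerator = support_ac - support_a * support_c
    # Denominator: the larger of the two normalizing terms,
    # which keeps the result inside [-1, +1].
    denominator = np.maximum(
        support_ac * (1 - support_a),
        support_a * (support_c - support_ac),
    )
    return numerator / denominator


# Made-up support values covering the three reference cases:
# always together, independent, never together.
toy_rules = pd.DataFrame({
    "antecedent support": [0.5, 0.5, 0.5],
    "consequent support": [0.5, 0.5, 0.5],
    "support": [0.5, 0.25, 0.0],
})
toy_rules["zhang"] = zhang_metric(toy_rules)
print(toy_rules)
```

On these toy rows the metric comes out as +1 for items that always appear together, 0 for independent items, and -1 for items that never appear together, which matches the interpretation given above.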
This exercise is part of the course
Market Basket Analysis in Python
Exercise instructions
- Generate the set of association rules with a lift value of at least 1.00.
- Set the antecedent support threshold to 0.005.
- Compute Zhang's rule and assign the output to the column zhang in rules.
- Select the rules that have a Zhang's metric that is greater than 0.98.
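The thresholds in these steps act as successive row filters on the rules table. The sketch below shows the same kind of filtering on a small hand-made table. Every value in it is invented for illustration, and the item names are plain strings, whereas mlxtend stores antecedents and consequents as frozensets.

```python
import pandas as pd

# Made-up rules table; the columns mirror those used in this exercise.
toy_rules = pd.DataFrame({
    "antecedents": ["bag", "candle", "mug", "card"],
    "consequents": ["box", "holder", "coaster", "wrap"],
    "antecedent support": [0.010, 0.004, 0.020, 0.030],
    "consequent support": [0.012, 0.015, 0.018, 0.003],
    "lift": [40.0, 55.0, 0.8, 12.0],
    "zhang": [0.99, 0.995, -0.20, 0.985],
})

# Combine the thresholds into one boolean mask.
mask = (
    (toy_rules["lift"] >= 1.0)
    & (toy_rules["antecedent support"] > 0.005)
    & (toy_rules["consequent support"] > 0.005)
    & (toy_rules["zhang"] > 0.98)
)
selected = toy_rules[mask]
print(selected[["antecedents", "consequents"]])
```

Only the first toy rule survives: the second fails the antecedent support threshold, the third has a lift below 1, and the fourth fails the consequent support threshold. Applying the filters one at a time, as the sample code does, gives the same rows as the combined mask.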
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Generate the initial set of rules using a minimum lift of 1.00
rules = association_rules(frequent_itemsets, metric = "____", min_threshold = ____)
# Set antecedent support to 0.005
rules = rules[rules['____'] > 0.005]
# Set consequent support to 0.005
rules = rules[rules['consequent support'] > 0.005]
# Compute Zhang's rule
rules['zhang'] = ____(____)
# Set the lower bound for Zhang's rule to 0.98
rules = rules[____['zhang'] > 0.98]
print(rules[['antecedents', 'consequents']])