Selecting a support threshold
The manager of the online gift store looks at the results you provided from the previous exercise and commends you for the good work. She does, however, raise an issue: all of the itemsets you identified contain only one item. She asks whether it would be possible to use a less restrictive rule and to generate more itemsets, possibly including those with multiple items.
After agreeing to do this, you think about what might explain the lack of itemsets with more than one item. It can't be the max_len parameter, since that was set to three. You conclude it must be the support threshold and decide to test two different values, each time checking how many additional itemsets are generated. Note that pandas is available as pd and the one-hot encoded data is available as onehot.
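To see why the support threshold, rather than the maximum length, can be what hides multi-item itemsets, here is a minimal brute-force sketch in plain Python. It is not how mlxtend computes its results. The transactions, the helper name frequent_itemsets, and the thresholds 0.5 and 0.1 are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy transactions (not the gift store data).
transactions = [
    {"mug", "card"},
    {"mug", "card", "candle"},
    {"mug"},
    {"card"},
    {"candle", "mug"},
]


def frequent_itemsets(transactions, min_support, max_len):
    """Return {itemset: support} for every itemset of up to max_len items
    whose support (share of transactions containing it) is >= min_support."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_len + 1):
        for candidate in combinations(items, size):
            # Count the transactions that contain every item in the candidate.
            count = sum(1 for t in transactions if set(candidate) <= t)
            support = count / n
            if support >= min_support:
                result[candidate] = support
    return result


# Same max_len in both calls; only the support threshold changes.
strict = frequent_itemsets(transactions, min_support=0.5, max_len=3)
loose = frequent_itemsets(transactions, min_support=0.1, max_len=3)

print(len(strict), len(loose))
print([itemset for itemset in loose if len(itemset) > 1])
```

With the stricter threshold, only single items survive, even though max_len allows up to three. Lowering the threshold lets pairs and the one triple through, because an itemset can never be more frequent than any of its individual items.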
This exercise is part of the course Market Basket Analysis in Python.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import apriori from mlxtend
from mlxtend.____ import ____
# Compute frequent itemsets using a support of 0.003 and length of 3
frequent_itemsets_1 = apriori(onehot, min_support = ____,
                              max_len = ____, use_colnames = True)
# Compute frequent itemsets using a support of 0.001 and length of 3
frequent_itemsets_2 = apriori(onehot, min_support = ____,
                              ____, use_colnames = True)
# Print the number of frequent itemsets
print(len(frequent_itemsets_1), len(frequent_itemsets_2))