Selecting a support threshold
The manager of the online gift store looks at the results you provided from the previous exercise and commends you for the good work. She does, however, raise an issue: all of the itemsets you identified contain only one item. She asks whether it would be possible to use a less restrictive rule and to generate more itemsets, possibly including those with multiple items.
After agreeing to do this, you think about what might explain the lack of itemsets with more than one item. It can't be the max_len parameter, since that was set to three. You decide it must be the support threshold and test two different values, each time checking how many additional itemsets are generated. Note that pandas is available as pd and the one-hot encoded data is available as onehot.
This exercise is part of the course
Market Basket Analysis in Python
Interactive exercise
Try this exercise by completing the sample code.
# Import apriori from mlxtend
from mlxtend.____ import ____
# Compute frequent itemsets using a support of 0.003 and a maximum length of 3
frequent_itemsets_1 = apriori(onehot, min_support = ____,
                              max_len = ____, use_colnames = True)
# Compute frequent itemsets using a support of 0.001 and a maximum length of 3
frequent_itemsets_2 = apriori(onehot, min_support = ____,
                              ____, use_colnames = True)
# Print the number of frequent itemsets
print(len(frequent_itemsets_1), len(frequent_itemsets_2))
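
The reasoning above, that the support threshold rather than max_len is what keeps multi-item itemsets out of the results, can be illustrated with a minimal sketch that does not depend on mlxtend. It is a brute-force support count in plain Python, not the apriori algorithm and not the library's implementation. The transactions, item names, thresholds (0.4 and 0.1), and the helper frequent_itemsets are all hypothetical, chosen only to make the effect visible on ten baskets.

```python
from itertools import combinations

# Hypothetical toy transactions (assumption): the exercise's real `onehot`
# data is not reproduced here.
transactions = [
    {"mug", "card"},
    {"mug", "card", "ribbon"},
    {"mug"},
    {"card"},
    {"mug", "ribbon"},
    {"candle"},
    {"mug", "card"},
    {"ribbon"},
    {"mug"},
    {"card", "candle"},
]


def frequent_itemsets(transactions, min_support, max_len):
    """Return {itemset: support} for every itemset of up to max_len items
    whose support (share of transactions containing it) is >= min_support.

    Illustrative helper only: it enumerates all candidates by brute force.
    """
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_len + 1):
        for candidate in combinations(items, size):
            # Count transactions that contain every item in the candidate.
            count = sum(1 for t in transactions if set(candidate) <= t)
            support = count / n
            if support >= min_support:
                result[candidate] = support
    return result


# Same max_len in both runs; only the support threshold changes.
strict = frequent_itemsets(transactions, min_support=0.4, max_len=3)
loose = frequent_itemsets(transactions, min_support=0.1, max_len=3)

# Number of itemsets found under each threshold.
print(len(strict), len(loose))  # prints: 2 9
# Size of the largest itemset found under each threshold.
print(max(len(k) for k in strict), max(len(k) for k in loose))  # prints: 1 3
```

With the stricter threshold only the two most common single items qualify, even though max_len allows three. Lowering the threshold admits pairs and one triple, which mirrors what the exercise asks you to check by comparing the lengths of the two results.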