Exercise

Selecting a support threshold

The manager of the online gift store looks at the results you provided from the previous exercise and commends you for the good work. She does, however, raise an issue: all of the itemsets you identified contain only one item. She asks whether it would be possible to use a less restrictive rule and to generate more itemsets, possibly including those with multiple items.

After agreeing to do this, you think about what might explain the lack of itemsets with more than one item. It can't be the max_len parameter, since that was set to three. You conclude that the minimum support threshold must be the cause and decide to test two different values, each time checking how many additional itemsets are generated. Note that pandas is available as pd and the one-hot encoded data is available as onehot.

Instructions 1/2

  • Complete the import statement for the apriori algorithm.
  • For frequent_itemsets_1, set the minimum support to 0.003 and the maximum length to 3.
  • For frequent_itemsets_2, set the minimum support to 0.001 and the maximum length to 3.
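
To see why the support threshold, rather than the maximum length, controls whether multi-item itemsets appear, here is a minimal sketch on made-up data. It is not the apriori function the exercise asks you to import and it does not use onehot; the transactions, the frequent_itemsets helper, and the thresholds 0.5 and 0.2 are all hypothetical, chosen only so the effect is visible on ten transactions. The helper counts every candidate itemset by brute force instead of pruning the way Apriori does.

```python
from itertools import combinations

# Hypothetical toy transactions (NOT the gift store's data).
transactions = [
    {"mug", "card"},
    {"mug", "card", "ribbon"},
    {"mug"},
    {"card"},
    {"mug", "ribbon"},
    {"candle"},
    {"mug", "card"},
    {"card", "ribbon"},
    {"mug"},
    {"candle", "card"},
]


def frequent_itemsets(transactions, min_support, max_len):
    """Brute-force search: return {itemset: support} for every itemset of
    up to max_len items whose support is at least min_support."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, max_len + 1):
        for candidate in combinations(items, k):
            # Support = share of transactions containing every item in the candidate.
            count = sum(1 for t in transactions if set(candidate) <= t)
            support = count / n
            if support >= min_support:
                result[candidate] = support
    return result


# Same max_len both times; only the support threshold changes.
strict = frequent_itemsets(transactions, min_support=0.5, max_len=3)
loose = frequent_itemsets(transactions, min_support=0.2, max_len=3)

print(len(strict), max(len(s) for s in strict))  # 2 itemsets, longest has 1 item
print(len(loose), max(len(s) for s in loose))    # 7 itemsets, longest has 2 items
```

With the stricter threshold only single items qualify, even though max_len allows three. Lowering the threshold admits pairs such as ("card", "mug"), because an itemset can never be more frequent than its least frequent item, so longer itemsets need a lower bar to appear.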