Exercise

Optimality of the support-confidence border

You return to the founder with the scatterplot produced in the previous exercise and ask whether she would like you to use pruning to recover the support-confidence border. You tell her about the Bayardo-Agrawal result, but she seems skeptical and asks whether you can demonstrate it with an example.

Recalling that a scatterplot can scale the size of its dots according to a third metric, you decide to use this to demonstrate the optimality of the support-confidence border. You will scale the dot size by lift, which is one of the metrics to which the Bayardo-Agrawal result applies. The one-hot encoded data has been imported for you and is available as onehot. Additionally, apriori() and association_rules() have been imported, and pandas is available as pd.
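
As a rough illustration of why lift behaves well on the border, it helps to write lift in terms of confidence. The sketch below assumes the rules being compared share a fixed consequent Y, which is the setting in which the Bayardo-Agrawal result is usually stated; X denotes the antecedent.

```latex
\[
\mathrm{lift}(X \Rightarrow Y)
  = \frac{\mathrm{supp}(X \cup Y)}{\mathrm{supp}(X)\,\mathrm{supp}(Y)}
  = \frac{\mathrm{conf}(X \Rightarrow Y)}{\mathrm{supp}(Y)}
\]
```

With supp(Y) held fixed, lift grows with confidence and does not otherwise depend on support. Under that assumption, at least one rule with the highest lift sits on the support-confidence border, so the largest dots in the scatterplot should appear along its upper edge.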

Instructions
100 XP
  • Apply the Apriori algorithm to the DataFrame onehot.
  • Compute the association rules using the support metric and a minimum threshold of 0.0.
  • Complete the expression for the scatterplot such that the dot size is scaled by lift.
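
The idea behind these steps can be checked by hand on a small example. The following is a minimal sketch in plain Python, not the exercise solution: it does not use apriori() or association_rules(), the transactions are made-up toy data rather than onehot, and the names support, on_border, and rules are hypothetical helpers introduced only for illustration. It fixes one consequent, computes support, confidence, and lift for each candidate rule, and checks that the highest-lift rule lies on the support-confidence border.

```python
from itertools import combinations

# Hypothetical toy transactions (an assumption; not the exercise's onehot data).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "jam"},
    {"butter", "jam"},
]
n = len(transactions)


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / n


# Fix a single consequent so that lift is proportional to confidence.
consequent = frozenset({"milk"})
items = sorted(set().union(*transactions) - consequent)

rules = []
for size in (1, 2):
    for antecedent in combinations(items, size):
        a = frozenset(antecedent)
        supp_rule = support(a | consequent)
        if supp_rule == 0:
            continue  # the rule never occurs in the data
        conf = supp_rule / support(a)
        lift = conf / support(consequent)
        rules.append(
            {"antecedent": a, "support": supp_rule,
             "confidence": conf, "lift": lift}
        )


def on_border(rule):
    """True if no other rule is at least as good on both metrics and better on one."""
    return not any(
        o["support"] >= rule["support"]
        and o["confidence"] >= rule["confidence"]
        and (o["support"] > rule["support"]
             or o["confidence"] > rule["confidence"])
        for o in rules
    )


border = [r for r in rules if on_border(r)]
best = max(rules, key=lambda r: r["lift"])

for r in sorted(rules, key=lambda r: -r["lift"]):
    tag = "border" if on_border(r) else ""
    print(sorted(r["antecedent"]), round(r["support"], 3),
          round(r["confidence"], 3), round(r["lift"], 3), tag)
```

In a scatterplot of support against confidence, the lift values computed here would play the role of the dot sizes, so the largest dot would fall on the border.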