Optimality of the support-confidence border
You return to the founder with the scatterplot produced in the previous exercise and ask whether she would like you to use pruning to recover the support-confidence border. You tell her about the Bayardo-Agrawal result, but she seems skeptical and asks whether you can demonstrate it with an example.
Recalling that scatterplots can scale the size of dots according to a third metric, you decide to use that to demonstrate the optimality of the support-confidence border. You will show this by scaling the dot size using the lift metric, which is one of the metrics to which the Bayardo-Agrawal result applies. The one-hot encoded data has been imported for you and is available as onehot. Additionally, apriori() and association_rules() have been imported and pandas is available as pd.
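In case it helps to picture the input, here is a minimal sketch of what a one-hot encoded transaction DataFrame like onehot might look like. The items and transactions below are invented purely for illustration; the real onehot DataFrame is already loaded in the exercise environment.

import pandas as pd

# Hypothetical one-hot encoded transactions: one row per transaction,
# one boolean column per item (True if the item appears in that basket)
onehot_example = pd.DataFrame({
    "bread":  [True,  True,  False, True],
    "butter": [True,  False, True,  False],
    "jam":    [False, True,  False, True]
})
print(onehot_example)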
Exercise instructions
- Apply the Apriori algorithm to the DataFrame onehot.
- Compute the association rules using the support metric and a minimum threshold of 0.0.
- Complete the expression for the scatterplot such that the dot size is scaled by lift.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import matplotlib and seaborn under their standard aliases
import matplotlib.pyplot as plt
import seaborn as sns
# Apply the Apriori algorithm with a support value of 0.0075
frequent_itemsets = ____(____, min_support = 0.0075,
use_colnames = True, max_len = 2)
# Generate association rules without performing additional pruning
rules = ____(frequent_itemsets, metric = "support",
min_threshold = ____)
# Generate scatterplot using support and confidence
sns.scatterplot(x = "support", y = "confidence",
size = "____", data = rules)
plt.show()
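For reference, one possible completed version is sketched below. It assumes that apriori() and association_rules() come from mlxtend.frequent_patterns; in the exercise environment they are already imported, so the mlxtend import is only needed if you run this elsewhere.

# Completed sketch (assumes onehot is the preloaded one-hot encoded DataFrame;
# the mlxtend import reflects an assumption about where the functions come from)
import seaborn as sns
import matplotlib.pyplot as plt
from mlxtend.frequent_patterns import apriori, association_rules

# Apply the Apriori algorithm with a minimum support of 0.0075
frequent_itemsets = apriori(onehot, min_support = 0.0075,
                            use_colnames = True, max_len = 2)

# Generate association rules without performing additional pruning
rules = association_rules(frequent_itemsets, metric = "support",
                          min_threshold = 0.0)

# Plot support against confidence, scaling the dot size by lift
sns.scatterplot(x = "support", y = "confidence",
                size = "lift", data = rules)
plt.show()

If the Bayardo-Agrawal result holds for this data, the largest dots should cluster along the upper support-confidence border, which is exactly what you want to show the founder.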