Pruning with scatterplots
After viewing your Batman-based streaming service proposal from the previous exercise, the founder realizes that her initial plan may have been too narrow. Rather than focusing on initial titles, she asks you to focus on general patterns in the association rules and then perform pruning accordingly. Your goal should be to identify a large set of strong associations.
Fortunately, you've just learned how to generate scatterplots. You decide to start by plotting support against confidence, since the rules that are optimal under many common metrics lie on the support-confidence border. The one-hot encoded data has been imported for you and is available as onehot. Additionally, apriori() and association_rules() have been imported, and pandas is available as pd.
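To make the two plot axes concrete, here is a minimal sketch that computes support and confidence by hand for every two-item rule in a tiny dataset. It assumes the usual definitions (support is the share of transactions containing both items; confidence is that support divided by the support of the antecedent). The transactions and the helper names support and pair_rules are hypothetical and are not part of the exercise data or of any library.

```python
from itertools import permutations

# Hypothetical toy data: each set holds the titles one user watched.
transactions = [
    {"Batman Begins", "The Dark Knight"},
    {"Batman Begins", "The Dark Knight", "Inception"},
    {"The Dark Knight", "Inception"},
    {"Batman Begins"},
]

def support(itemset, transactions):
    """Share of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def pair_rules(transactions):
    """Return (antecedent, consequent, support, confidence) for each ordered pair."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, c in permutations(items, 2):
        joint = support({a, c}, transactions)
        if joint == 0:
            continue  # the pair never co-occurs, so there is no rule to plot
        confidence = joint / support({a}, transactions)
        rules.append((a, c, joint, confidence))
    return rules

rules = pair_rules(transactions)
for a, c, s, conf in rules:
    print(f"{a} -> {c}: support={s:.2f}, confidence={conf:.2f}")
```

Each tuple corresponds to one point in a support-confidence scatterplot. Because the antecedent's support can never exceed 1, a rule's confidence is always at least as large as its support, so every point sits on or above the diagonal.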
This exercise is part of the course
Market Basket Analysis in Python
Exercise instructions
- Generate a large number of itemsets with 2 items by setting the minimum support to 0.0075 and setting the maximum length to 2.
- Complete the statement for association_rules() in a way that avoids additional filtering.
- Complete the statement to generate the scatterplot, setting the y variable to use confidence.
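The effect of the minimum support and maximum length settings can be sketched without any library. The example below is a brute-force enumeration, not the Apriori algorithm and not how apriori() is implemented. The one-hot rows, the column names A, B and C, and the helper frequent_itemsets are all hypothetical.

```python
from itertools import combinations

# Hypothetical one-hot rows: each dict maps a title to whether the user watched it.
rows = [
    {"A": True,  "B": True,  "C": False},
    {"A": True,  "B": False, "C": True},
    {"A": True,  "B": True,  "C": True},
    {"A": False, "B": False, "C": True},
]

def frequent_itemsets(rows, min_support, max_len):
    """Return {itemset: support} for itemsets of up to `max_len` items."""
    columns = sorted(rows[0])
    result = {}
    for size in range(1, max_len + 1):
        for candidate in combinations(columns, size):
            # Count the rows in which every item of the candidate is present.
            hits = sum(1 for row in rows if all(row[c] for c in candidate))
            share = hits / len(rows)
            if share >= min_support:
                result[frozenset(candidate)] = share
    return result

loose = frequent_itemsets(rows, min_support=0.25, max_len=2)
strict = frequent_itemsets(rows, min_support=0.75, max_len=2)
print(len(loose), len(strict))
```

Lowering the support threshold keeps more itemsets, which is why a small value such as 0.0075 yields a large pool of candidate rules to inspect in the scatterplot before any pruning is applied.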
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Import seaborn and matplotlib under their standard aliases
import matplotlib.pyplot as plt
import seaborn as sns
# Apply the Apriori algorithm with a support value of 0.0075
frequent_itemsets = apriori(onehot, min_support = ___,
                            use_colnames = True, max_len = ____)
# Generate association rules without performing additional pruning
rules = association_rules(____, metric = 'support',
                          min_threshold = ____)
# Generate scatterplot using support and confidence
sns.scatterplot(x = "support", y = "____", data = ____)
plt.show()