Optimality of the support-confidence border

You return to the founder with the scatterplot produced in the previous exercise and ask whether she would like you to use pruning to recover the support-confidence border. You tell her about the Bayardo-Agrawal result, but she is skeptical and asks whether you can demonstrate it with an example.

Recalling that a scatterplot can scale the size of its points according to a third metric, you decide to use that feature to demonstrate the optimality of the support-confidence border. You will do so by scaling point size with the lift metric, one of the metrics to which the Bayardo-Agrawal result applies. The one-hot encoded data has already been imported and is available as onehot. In addition, apriori() and association_rules() have been imported, and pandas is available as pd.
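
To make the three plotted quantities concrete, the following is a minimal pure-Python sketch of how support, confidence, and lift relate for a single rule. The toy transactions and the helper names support() and rule_metrics() are hypothetical and are not part of the course environment; in the exercise itself these values come from association_rules().

```python
# Toy transactions (hypothetical data, not the course's `onehot` DataFrame).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent, transactions):
    """Support, confidence, and lift for the rule antecedent -> consequent.

    Assumes the antecedent and the consequent each occur at least once,
    so neither division is by zero.
    """
    supp_xy = support(set(antecedent) | set(consequent), transactions)
    supp_x = support(antecedent, transactions)
    supp_y = support(consequent, transactions)
    confidence = supp_xy / supp_x   # x-axis is supp_xy, y-axis is confidence
    lift = confidence / supp_y      # third metric, used for point size
    return {"support": supp_xy, "confidence": confidence, "lift": lift}


metrics = rule_metrics({"butter"}, {"bread"}, transactions)
print(metrics)
```

Each rule therefore contributes one point to the plot: support on the x-axis, confidence on the y-axis, and lift as the point size.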

This exercise is part of the course

Market Basket Analysis in Python


Exercise instructions

Apply the Apriori algorithm to the onehot DataFrame.
Compute the association rules using the support metric and a minimum threshold of 0.0.
Complete the scatterplot expression so that point size is scaled by lift.

Interactive hands-on exercise

Try this exercise by completing the sample code.

# Import seaborn under its standard alias
import seaborn as sns

# Import pyplot, which plt.show() below relies on
import matplotlib.pyplot as plt

# Apply the Apriori algorithm with a support value of 0.0075
frequent_itemsets = ____(____, min_support = 0.0075, 
                         use_colnames = True, max_len = 2)

# Generate association rules without performing additional pruning
rules = ____(frequent_itemsets, metric = "support", 
                          min_threshold = ____)

# Generate scatterplot using support and confidence
sns.scatterplot(x = "support", y = "confidence", 
                size = "____", data = rules)
plt.show()
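
The claim can also be checked numerically rather than by eye. The sketch below uses a handful of made-up (support, confidence) pairs, not values from the course data, and assumes every rule shares the same consequent, which is the setting in which, to our understanding, the Bayardo-Agrawal result is stated. Under that assumption lift is just confidence divided by a constant, so the rule with the highest lift should sit on the support-confidence border. The names CONSEQUENT_SUPPORT, on_border(), and lift() are hypothetical.

```python
# Hypothetical rules with a shared consequent: (name, support, confidence).
CONSEQUENT_SUPPORT = 0.5  # assumed support of the common consequent
rules = [
    ("A", 0.30, 0.60),
    ("B", 0.20, 0.80),
    ("C", 0.10, 0.90),
    ("D", 0.15, 0.70),
    ("E", 0.05, 0.55),
]


def on_border(rule, rules):
    """True if no other rule is at least as good on both axes and better on one."""
    _, s, c = rule
    return not any(
        (s2 >= s and c2 >= c) and (s2 > s or c2 > c)
        for _, s2, c2 in rules
    )


def lift(rule):
    """Lift for a fixed consequent: confidence / support(consequent)."""
    return rule[2] / CONSEQUENT_SUPPORT


# Rules that are not dominated in both support and confidence.
border = [r[0] for r in rules if on_border(r, rules)]

# The rule that would be drawn with the largest point in the scatterplot.
best_by_lift = max(rules, key=lift)

print("border:", border)
print("highest lift:", best_by_lift[0])
```

In the scatterplot this corresponds to the largest points appearing along the upper-right edge of the cloud rather than in its interior.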



Skill level: Intermediate


In this chapter, you’ll learn the basics of Market Basket Analysis: association rules, metrics, and pruning. You’ll then apply these concepts to help a small grocery store improve its promotional and product placement efforts.

Exercise 1: What is market basket analysis?
Exercise 2: The basics of market basket analysis
Exercise 3: Cross-selling products
Exercise 4: Identifying association rules
Exercise 5: Multiple antecedents and consequents
Exercise 6: Preparing data for market basket analysis
Exercise 7: Generating association rules
Exercise 8: The simplest metric
Exercise 9: One-hot encoding transaction data
Exercise 10: Computing the support metric

Association rules tell us that two or more items are related. Metrics allow us to quantify the usefulness of those relationships. In this chapter, you’ll apply six metrics to evaluate association rules: support, confidence, lift, conviction, leverage, and Zhang's metric. You’ll then use association rules and metrics to assist a library and an e-book seller.

Exercise 1: Confidence and lift
Exercise 2: Recommending books with support
Exercise 3: Refining support with confidence
Exercise 4: Further refinement with lift
Exercise 5: Leverage and conviction
Exercise 6: Lift versus leverage
Exercise 7: Computing conviction
Exercise 8: Computing conviction with a function
Exercise 9: Promoting ebooks with conviction
Exercise 10: Association and dissociation
Exercise 11: Computing association and dissociation
Exercise 12: Defining Zhang's metric
Exercise 13: Applying Zhang's metric
Exercise 14: Advanced rules
Exercise 15: Filtering with support and conviction
Exercise 16: Using multi-metric filtering to cross-promote books

The fundamental problem of Market Basket Analysis is determining how to translate vast amounts of customer decisions into a small number of useful rules. This process typically starts with the application of the Apriori algorithm and involves the use of additional strategies, such as pruning and aggregation. In this chapter, you’ll learn how to use these methods and will ultimately apply them in exercises where you assist a retailer in selecting a physical store layout and performing product cross-promotions.

Exercise 1: Aggregation
Exercise 2: Performing aggregation
Exercise 3: Defining an aggregation function
Exercise 4: The Apriori algorithm
Exercise 5: Pruning and Apriori
Exercise 6: Identifying frequent itemsets with Apriori
Exercise 7: Selecting a support threshold
Exercise 8: Basic Apriori results pruning
Exercise 9: Generating association rules
Exercise 10: Pruning with lift
Exercise 11: Pruning with confidence
Exercise 12: Advanced Apriori results pruning
Exercise 13: Aggregation and filtering
Exercise 14: Applying Zhang's rule
Exercise 15: Advanced filtering with multiple metrics

In this final chapter, you’ll learn how visualizations are used to guide the pruning process and summarize final results, which will typically take the form of itemsets or rules. You’ll master the three most useful visualizations (heatmaps, scatterplots, and parallel coordinates plots) and will apply them to assist a movie streaming service.

Exercise 1: Heatmaps
Exercise 2: Visualizing itemset support
Exercise 3: Heatmaps with lift
Exercise 4: Interpreting heatmaps
Exercise 5: Scatterplots
Exercise 6: Pruning with scatterplots
Exercise 7: Optimality of the support-confidence border (current exercise)
Exercise 8: Parallel coordinates plot
Exercise 9: Using parallel coordinates to visualize rules
Exercise 10: Refining a parallel coordinates plot
Exercise 11: Congratulations!