Market Basket Analysis in Python
  • 1

    Introduction to Market Basket Analysis

    Free

    In this chapter, you’ll learn the basics of Market Basket Analysis: association rules, metrics, and pruning. You’ll then apply these concepts to help a small grocery store improve its promotional and product placement efforts.

  • 2

    Association Rules

    Association rules tell us that two or more items are related. Metrics allow us to quantify the usefulness of those relationships. In this chapter, you’ll apply six metrics to evaluate association rules: support, confidence, lift, conviction, leverage, and Zhang's metric. You’ll then use association rules and metrics to assist a library and an e-book seller.

  • 3

    Aggregation and Pruning

    The fundamental problem of Market Basket Analysis is determining how to translate vast amounts of customer decisions into a small number of useful rules. This process typically starts with the application of the Apriori algorithm and involves the use of additional strategies, such as pruning and aggregation. In this chapter, you’ll learn how to use these methods and will ultimately apply them in exercises where you assist a retailer in selecting a physical store layout and performing product cross-promotions.

  • 4

    Visualizing Rules

    In this final chapter, you’ll learn how visualizations are used to guide the pruning process and summarize final results, which typically take the form of itemsets or rules. You’ll master the three most useful visualizations (heatmaps, scatterplots, and parallel coordinates plots) and apply them to assist a movie streaming service.

    • Heatmaps (50 XP)
    • Visualizing itemset support (100 XP)
    • Heatmaps with lift (100 XP)
    • Interpreting heatmaps (50 XP)
    • Scatterplots (50 XP)
    • Pruning with scatterplots (100 XP)
    • Optimality of the support-confidence border (100 XP)
    • Parallel coordinates plot (50 XP)
    • Using parallel coordinates to visualize rules (100 XP)
    • Refining a parallel coordinates plot (100 XP)
    • Congratulations! (50 XP)
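
The six metrics listed for Chapter 2 can all be computed directly from one-hot encoded transaction data. Below is a minimal sketch for a single hypothetical rule (milk → bread) on toy data; the item names and values are illustrative, not from the course datasets.

```python
import pandas as pd

# Toy one-hot encoded transactions (hypothetical item names and values).
onehot = pd.DataFrame({
    "milk":  [1, 1, 1, 0, 1],
    "bread": [1, 1, 0, 1, 0],
}).astype(bool)

# Probabilities estimated from the data for the rule milk -> bread.
p_x  = onehot["milk"].mean()                      # P(milk)  = 0.8
p_y  = onehot["bread"].mean()                     # P(bread) = 0.6
p_xy = (onehot["milk"] & onehot["bread"]).mean()  # P(both)  = 0.4

support    = p_xy                              # share of baskets with both items
confidence = p_xy / p_x                        # P(bread | milk)
lift       = confidence / p_y                  # confidence relative to chance
leverage   = p_xy - p_x * p_y                  # co-occurrence above independence
conviction = (1 - p_y) / (1 - confidence)      # ratio of expected to observed misses
zhang      = leverage / max(p_xy * (1 - p_x),  # normalized leverage in [-1, 1]
                            p_x * (p_y - p_xy))
```

Here lift is below 1 and leverage and Zhang's metric are negative, indicating that in this toy data milk and bread co-occur less often than independence would predict.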

Exercise

Optimality of the support-confidence border

You return to the founder with the scatterplot produced in the previous exercise and ask whether she would like you to use pruning to recover the support-confidence border. You tell her about the Bayardo-Agrawal result, but she seems skeptical and asks whether you can demonstrate this in an example.

Recalling that scatterplots can scale dot size according to a third metric, you decide to use this to demonstrate the optimality of the support-confidence border. You will show this by scaling dot size using the lift metric, one of the metrics to which the Bayardo-Agrawal result applies. The one-hot encoded data has been imported for you and is available as onehot. Additionally, apriori() and association_rules() have been imported, and pandas is available as pd.

Instructions

100 XP
  • Apply the Apriori algorithm to the DataFrame onehot.
  • Compute the association rules using the support metric and a minimum threshold of 0.0.
  • Complete the expression for the scatterplot such that the dot size is scaled by lift.
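
The three steps above can be sketched as follows. In the course environment, apriori() and association_rules() come from mlxtend; to keep this sketch self-contained and runnable without that library, the same pipeline is mimicked with pandas for single-item rules, and the onehot DataFrame here is a hypothetical stand-in for the course data.

```python
import itertools

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical one-hot encoded transactions, standing in for `onehot`.
onehot = pd.DataFrame({
    "fiction":   [1, 1, 0, 1, 1, 0],
    "biography": [1, 0, 1, 1, 0, 0],
    "poetry":    [0, 1, 1, 1, 0, 1],
}).astype(bool)

# Steps 1-2: enumerate single-item rules and compute support, confidence,
# and lift, mimicking apriori() followed by association_rules() with
# metric="support" and min_threshold=0.0.
rows = []
for antecedent, consequent in itertools.permutations(onehot.columns, 2):
    support_a  = onehot[antecedent].mean()
    support_c  = onehot[consequent].mean()
    support_ac = (onehot[antecedent] & onehot[consequent]).mean()
    confidence = support_ac / support_a
    rows.append({"antecedent": antecedent, "consequent": consequent,
                 "support": support_ac, "confidence": confidence,
                 "lift": confidence / support_c})
rules = pd.DataFrame(rows)

# Step 3: scatterplot of support vs. confidence with dot size scaled by lift.
plt.scatter(rules["support"], rules["confidence"], s=rules["lift"] * 100)
plt.xlabel("support")
plt.ylabel("confidence")
plt.savefig("support_confidence_border.png")
```

In the actual exercise, the first two steps would instead use the imported functions, roughly frequent_itemsets = apriori(onehot, use_colnames=True, ...) followed by rules = association_rules(frequent_itemsets, metric="support", min_threshold=0.0). The largest dots gravitating toward the upper-right frontier is what illustrates that the support-confidence border contains the lift-optimal rules.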