Using parallel coordinates to visualize rules
Your visual demonstration in the previous exercise convinced the founder that the support-confidence border is worthy of further exploration. She now suggests that you extract part of the border and visualize it. Since the rules that fall on the border are strong with respect to most common metrics, she argues that you should simply visualize whether a rule exists, rather than the intensity of the rule according to some metric.
You realize that a parallel coordinates plot is ideal for such cases. The data has been imported for you as onehot. Additionally, apriori(), association_rules(), and parallel_coordinates() have been imported, and pandas is available as pd. The function rules_to_coordinates() has been defined and is available.
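The body of rules_to_coordinates() is not shown in this exercise. As a rough sketch of what such a helper might do, the hypothetical function below (rules_to_coordinates_sketch, a made-up name) assumes the rules table has antecedents and consequents columns holding one-item frozensets, which is what you get when itemsets are capped at two items. The item names are invented for illustration.

```python
import pandas as pd

def rules_to_coordinates_sketch(rules: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for the course's rules_to_coordinates()."""
    return pd.DataFrame({
        # Pull the single item out of each one-item frozenset
        "antecedent": [next(iter(a)) for a in rules["antecedents"]],
        "consequent": [next(iter(c)) for c in rules["consequents"]],
        # Use the row index as a label that identifies each rule
        "rule": list(rules.index),
    })

# Toy rules table with made-up items, shaped like association_rules() output
toy_rules = pd.DataFrame({
    "antecedents": [frozenset({"bread"}), frozenset({"milk"})],
    "consequents": [frozenset({"butter"}), frozenset({"bread"})],
})

coords = rules_to_coordinates_sketch(toy_rules)
print(coords)
```

Each row of the result is one rule: the antecedent and consequent columns become the two vertical axes of the plot, and the rule column is the class label that gives each line its own color.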
This exercise is part of the course Market Basket Analysis in Python.
Exercise instructions
- Complete the Apriori algorithm statement using a minimum support of 0.05.
- Compute association rules using a minimum confidence threshold of 0.50. This is high enough to capture only points near the upper part of the support-confidence border.
- Convert the rules into coordinates.
- Plot the coordinates using parallel_coordinates().
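To see what the two thresholds in the instructions filter on, here is a minimal sketch that computes support and confidence by hand. The toy_onehot table and the support() and confidence() helpers are made up for illustration; they are not the course's onehot data or part of any library.

```python
import pandas as pd

# Toy one-hot basket data (made up): one row per transaction
toy_onehot = pd.DataFrame({
    "bread":  [True, True, True, False],
    "butter": [True, True, False, False],
    "milk":   [False, True, True, True],
})

def support(df, items):
    # Fraction of transactions that contain every item in `items`
    return df[list(items)].all(axis=1).mean()

def confidence(df, antecedent, consequent):
    # support(antecedent and consequent) / support(antecedent)
    return support(df, [antecedent, consequent]) / support(df, [antecedent])

s = support(toy_onehot, ["bread", "butter"])   # 2 of 4 transactions
c = confidence(toy_onehot, "bread", "butter")  # 2 of the 3 bread transactions

# The rule bread -> butter survives both thresholds used in this exercise
keep = (s >= 0.05) and (c >= 0.50)
print(s, c, keep)
```

A rule appears in the final plot only if its itemset clears the minimum support and the rule itself clears the minimum confidence.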
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Compute the frequent itemsets
frequent_itemsets = ____(onehot, min_support = ____,
                         use_colnames = True, max_len = 2)
# Compute rules from the frequent itemsets with the confidence metric
rules = association_rules(frequent_itemsets, metric = '____',
                          min_threshold = 0.50)
# Convert rules into coordinates suitable for use in a parallel coordinates plot
coords = rules_to_coordinates(____)
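# Generate parallel coordinates plot, using the 'rule' column as the class label
# (plt is assumed to be matplotlib.pyplot, preloaded in the exercise environment)
parallel_coordinates(____, 'rule')
# Hide the legend, which would otherwise list every rule
plt.legend([])
plt.show()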