Parallel coordinates plot
1. Parallel coordinates plot
In this video, we'll add another visualization to our toolset: the parallel coordinates plot. The parallel coordinates plot will allow us to visualize whether a rule exists between an antecedent and consequent.
2. What is a parallel coordinates plot?
So what is a parallel coordinates plot? We can think of it as a type of directed network diagram. That is, the plot shows connections between two objects that are related and indicates the direction of the relationship. The example parallel coordinates plot shown was constructed using association rules generated from the MovieLens dataset. Each row of the plot corresponds to a particular movie. And each line connects an antecedent on the left hand side with a consequent on the right hand side. Note that the labels on the consequent side of the plot are identical to those on the antecedent side. We can see that certain movies, such as The Dark Knight, are antecedents for many others. Other movies, such as Fight Club, are antecedents for only one. Parallel coordinates plots can be extended to incorporate multiple antecedents or consequents, but we'll focus on one-antecedent, one-consequent relationships.
3. When to use parallel coordinate plots
When should we use a parallel coordinates plot, rather than a heatmap or a scatter plot? If we only need to know whether a rule exists, not how intense it is, and want a diagram with minimal visual clutter, a parallel coordinates plot makes more sense than a heatmap. Alternatively, if we want to examine the individual rules themselves and aren't interested in the metric thresholds used to identify them, we'll want a parallel coordinates plot rather than a scatter plot.
4. Preparing the data
Let's generate a parallel coordinates plot. We'll start by importing association_rules and apriori. We'll also import the one-hot encoded MovieLens data. Next, we'll apply the Apriori algorithm with a minimum support of 0.10. We'll also use a max length of 2, since we are only interested in rules with one antecedent and one consequent. We'll then generate the rules, using support as the metric and setting its minimum threshold to 0.00 to avoid performing any further pruning.
5. Converting rules to coordinates
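To make the preparation step from the previous slide concrete, here is a minimal sketch that computes the same quantities by hand for the one-antecedent, one-consequent case. It does not call the apriori or association_rules functions named in the video; it is a pandas-only stand-in. The three-item one-hot table is made up for illustration and is not the MovieLens data.

```python
from itertools import permutations

import pandas as pd

# Toy one-hot data, made up for illustration (not the MovieLens data).
onehot = pd.DataFrame({
    "A": [True, True, True, False],
    "B": [True, True, False, False],
    "C": [False, True, True, True],
})

min_support = 0.10  # minimum itemset support mentioned in the video

# Support of a single item: the share of rows that contain it.
item_support = onehot.mean()

records = []
for a, c in permutations(onehot.columns, 2):
    # Support of the pair {a, c}: the share of rows containing both items.
    pair_support = (onehot[a] & onehot[c]).mean()
    if pair_support < min_support:
        continue  # the pair is not a frequent itemset, so it yields no rule
    confidence = pair_support / item_support[a]
    records.append({
        "antecedents": frozenset([a]),
        "consequents": frozenset([c]),
        "support": pair_support,
        "confidence": confidence,
        "lift": confidence / item_support[c],
    })

# A rule-level support threshold of 0.00 prunes nothing further.
rules = pd.DataFrame(records)
rules = rules[rules["support"] >= 0.00].reset_index(drop=True)
```

Restricting the loop to ordered pairs plays the role of the max length of 2: every rule has exactly one antecedent and one consequent.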
We now need to transform the rules into a format that can be input into a parallel coordinates plot. We'll start by converting the frozensets of antecedents and consequents to strings using lambda functions. We'll then create a column, rule, which sets a name for each rule equal to its index value. Finally, we'll define a DataFrame, coords, that contains the antecedent, consequent, and rule columns. In the exercises, we'll use a function called rules_to_coordinates, which takes the rules DataFrame as an input and outputs coordinates.
6. Generating a parallel coordinates plot
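The conversion from the previous slide can be sketched as follows. The body of rules_to_coordinates here is an assumption about what such a helper might look like, not the version used in the exercises, and the two rules are toy rows made up for illustration.

```python
import pandas as pd


def rules_to_coordinates(rules: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical sketch of the helper; assumes one item per frozenset."""
    coords = rules.copy()
    # Pull the single item out of each frozenset so the label is a plain string.
    coords["antecedent"] = coords["antecedents"].apply(lambda s: list(s)[0])
    coords["consequent"] = coords["consequents"].apply(lambda s: list(s)[0])
    # Name each rule after its index value.
    coords["rule"] = coords.index
    return coords[["antecedent", "consequent", "rule"]]


# Toy rules, made up for illustration.
rules = pd.DataFrame({
    "antecedents": [frozenset({"The Dark Knight"}), frozenset({"Fight Club"})],
    "consequents": [frozenset({"The Matrix"}), frozenset({"The Dark Knight"})],
})

coords = rules_to_coordinates(rules)
```

Taking the first element of each frozenset only makes sense because the rules were limited to one antecedent and one consequent.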
We have now prepared the data and can generate a parallel coordinates plot. We first import the function parallel_coordinates from the pandas plotting submodule. We need to specify the DataFrame that contains the coordinates and the column that contains the names of the rules. We can also specify a colormap.
7. Generating a parallel coordinates plot
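A minimal sketch of the plotting call, assuming a coords DataFrame with antecedent, consequent, and rule columns. The three rows and the "ocean" colormap are illustrative choices, not values taken from the video.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import parallel_coordinates

# Toy coordinates, made up for illustration.
coords = pd.DataFrame({
    "antecedent": ["The Dark Knight", "The Matrix", "Fight Club"],
    "consequent": ["The Matrix", "The Dark Knight", "The Dark Knight"],
    "rule": [0, 1, 2],
})

fig, ax = plt.subplots()
# The second argument names the column that identifies each rule;
# every other column becomes a vertical axis in the plot.
parallel_coordinates(coords, "rule", ax=ax, colormap="ocean")
```

In an interactive session, dropping the Agg line and calling plt.show() would display the figure.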
What can we learn from this plot? The Dark Knight and The Matrix seem to be strongly associated. This is also true of the movies in the Lord of the Rings trilogy. We might, however, think we were too restrictive with our support threshold.
8. Refining a parallel coordinates plot
Let's try refining our pruning parameters. We'll lower the minimum itemset support to 0.10. Let's also discard rules that are unlikely to be good by pruning those with a lift below 1.00.
9. Refining a parallel coordinates plot
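The lift-based pruning can be sketched as a plain filter on a rules DataFrame. This is a stand-in that assumes the rules table already has a lift column; the three rows and their lift values are made up for illustration.

```python
import pandas as pd

# Toy rules with made-up lift values.
rules = pd.DataFrame({
    "antecedents": [frozenset({"A"}), frozenset({"B"}), frozenset({"C"})],
    "consequents": [frozenset({"B"}), frozenset({"C"}), frozenset({"A"})],
    "lift": [1.33, 0.67, 1.00],
})

# Keep rules whose lift is at least 1.00; those below 1.00 are pruned.
kept = rules[rules["lift"] >= 1.00].reset_index(drop=True)
```

A lift below 1.00 means the antecedent and consequent appear together less often than they would if they were independent, which is why such rules are discarded.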
This now gives us many more rules. We can see strong associations between Batman movies, Lord of the Rings movies, and Star Wars movies.
10. Let's practice!
We now know how to generate parallel coordinate plots. Let's practice in some exercises!