Dendrogram aesthetics
So you made a dendrogram…but it's not as eye-catching as you had hoped!
The dendextend package can help your audience by coloring branches and outlining clusters. dendextend is designed to operate on dendrogram objects, so you first have to convert the hierarchical cluster object returned by hclust() into a dendrogram using as.dendrogram().
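As a minimal sketch of that conversion, the snippet below clusters a small made-up matrix and turns the result into a dendrogram. The matrix `toy` and the names `toy_hc` and `toy_dend` are hypothetical and are not part of the exercise, which uses `tweets_dist` and `hc` instead.

```r
# Toy data (an assumption for illustration): four "terms" with two made-up features.
toy <- matrix(
  c(1.0, 2.0,
    1.5, 1.8,
    8.0, 8.0,
    9.0, 9.5),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("data", "camp", "text", "mining"), NULL)
)

# hclust() returns an object of class "hclust".
toy_hc <- hclust(dist(toy))

# as.dendrogram() converts it to class "dendrogram", which dendextend works on.
toy_dend <- as.dendrogram(toy_hc)

class(toy_hc)
class(toy_dend)
```

Both functions come from the stats package, so this step does not need dendextend itself.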
A good way to review the terms in your dendrogram is with the labels() function, which prints every term in the dendrogram. To highlight specific branches, use branches_attr_by_labels(). First, pass in the dendrogram object, then a vector of terms such as c("data", "camp"). Lastly, add a color such as "blue".
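The two calls described above might look like this on a small made-up dendrogram. The data and the names `toy`, `toy_dend`, and `toy_colored` are assumptions for illustration only. The terms and color are the ones mentioned in the text.

```r
library(dendextend)

# Toy data (hypothetical): four terms with two made-up features.
toy <- matrix(
  c(1.0, 2.0, 1.5, 1.8, 8.0, 8.0, 9.0, 9.5),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("data", "camp", "text", "mining"), NULL)
)
toy_dend <- as.dendrogram(hclust(dist(toy)))

# Review the terms stored in the dendrogram.
labels(toy_dend)

# Color the branches leading to "data" and "camp" blue.
toy_colored <- branches_attr_by_labels(toy_dend, c("data", "camp"), "blue")

# Plot the result; the title is passed through the main argument.
plot(toy_colored, main = "Toy dendrogram")
```

The function returns a new dendrogram rather than modifying the original, so the result has to be assigned before plotting.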
After you make your plot, you can call out clusters with rect.dendrogram(). This adds rectangles for each cluster. The first argument to rect.dendrogram() is the dendrogram, followed by the number of clusters (k). You can also pass a border argument specifying what color you want the rectangles to be (e.g. "green").
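A short sketch of outlining clusters, again on made-up data: the matrix `toy` and the name `toy_dend` are hypothetical, and the choice of two clusters is an assumption that suits this toy example.

```r
library(dendextend)

# Toy data (hypothetical): two tight pairs of terms, far apart from each other.
toy <- matrix(
  c(1.0, 2.0, 1.5, 1.8, 8.0, 8.0, 9.0, 9.5),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("data", "camp", "text", "mining"), NULL)
)
toy_dend <- as.dendrogram(hclust(dist(toy)))

# The plot has to exist first; rect.dendrogram() draws on top of it.
plot(toy_dend, main = "Toy dendrogram with clusters")

# Outline k = 2 clusters with green rectangles.
rect.dendrogram(toy_dend, k = 2, border = "green")
```

Because rect.dendrogram() adds to an existing plot, calling it before plot() would fail or draw on whatever plot happened to be open.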
This exercise is part of the course
Text Mining with Bag-of-Words in R
Exercise instructions
The dendextend package has been loaded for you, and a hierarchical cluster object, hc, was created from tweets_dist.
- Create hcd as a dendrogram using as.dendrogram() on hc.
- Print the labels of hcd to the console.
- Use branches_attr_by_labels() to color the branches. Pass it three arguments: the hcd object, c("marvin", "gaye"), and the color "red". Assign to hcd_colored.
- plot() the dendrogram hcd_colored with the title "Better Dendrogram", added using the main argument.
- Add rectangles to the plot using rect.dendrogram(). Specify k = 2 clusters and a border color of "grey50".
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create hcd
___ <- ___(___)
# Print the labels in hcd
___(___)
# Change the branch color to red for "marvin" and "gaye"
___ <- ___(___, ___, ___)
# Plot hcd_colored
___(___, ___)
# Add cluster rectangles
___(___, ___, ___)