Hierarchical clustering: Occupation trees
In the previous exercise you learned that the oes data is ready for hierarchical clustering without any preprocessing. In this exercise you will build a dendrogram of occupations based on their yearly average salaries and propose clusters using a cut height of 100,000.
This exercise is part of the course
Cluster Analysis in R
Exercise instructions
- Calculate the Euclidean distance between the occupations and store this in dist_oes.
- Run hierarchical clustering using average linkage and store in hc_oes.
- Create a dendrogram object dend_oes from your hclust result using the function as.dendrogram().
- Plot the dendrogram.
- Using the color_branches() function, create and plot a new dendrogram with clusters colored by a cut height of 100,000.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Calculate Euclidean distance between the occupations
dist_oes <- dist(___, method = ___)
# Generate an average linkage analysis
hc_oes <- hclust(___, method = ___)
# Create a dendrogram object from the hclust variable
dend_oes <- as.dendrogram(___)
# Plot the dendrogram
plot(___)
# Color branches by cluster formed from the cut at a height of 100000
dend_colored <- color_branches(___, h = ___)
# Plot the colored dendrogram
plot(___)
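
To see how these steps fit together before filling in the blanks, here is a minimal sketch on a small invented matrix. The object toy_oes and its salary figures are hypothetical and are not the oes data. The call to cutree() is an extra step that is not part of the exercise, added only to show which cluster each row lands in. color_branches() is assumed to come from the dendextend package, so that step is skipped if the package is not installed.

```r
# Hypothetical toy data: rows are occupations, columns are yearly average salaries.
# These values are invented for illustration and are NOT the oes data.
toy_oes <- matrix(
  c( 30000,  31000,  32000,
     32000,  33000,  34000,
     90000,  92000,  95000,
     95000,  97000,  99000,
    200000, 205000, 210000),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    c("Cashier", "Waiter", "Engineer", "Analyst", "Surgeon"),
    c("2014", "2015", "2016")
  )
)

# Euclidean distance between every pair of rows (occupations)
dist_toy <- dist(toy_oes, method = "euclidean")

# Hierarchical clustering with average linkage
hc_toy <- hclust(dist_toy, method = "average")

# Convert the hclust result to a dendrogram object and plot it
dend_toy <- as.dendrogram(hc_toy)
plot(dend_toy)

# Extra step (not in the exercise): cluster membership when cutting at height 100000
clusters_toy <- cutree(hc_toy, h = 100000)
print(clusters_toy)

# Color branches by the same cut height, assuming dendextend is available
if (requireNamespace("dendextend", quietly = TRUE)) {
  dend_colored <- dendextend::color_branches(dend_toy, h = 100000)
  plot(dend_colored)
}
```

With these toy numbers, the cut at 100,000 separates the rows into three groups: the two lowest-paid occupations, the two mid-range ones, and the highest-paid one on its own. The same sequence of calls applies to oes, with the cut height read off the dendrogram's vertical axis.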