Explore wholesale customer clusters
Continuing your work on the wholesale dataset, you are now ready to analyze the characteristics of these clusters.
Because you are working with more than two dimensions, a scatter plot of the clusters would be hard to read. Instead, you will rely on summary statistics to explore the clusters. In this exercise you will analyze the mean amount spent in each cluster for all three categories.
This exercise is part of the course
Cluster Analysis in R
Exercise instructions
- Calculate the size of each cluster using count().
- Color and plot the dendrogram using a height of 15,000.
- Calculate the average spending for each category within each cluster using the summarise_all() function.
Hands-on interactive exercise
Try this exercise by completing this sample code.
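# customers_spend is assumed to be preloaded in the exercise environment
library(dplyr)       # mutate(), count(), group_by(), summarise_all(), %>%
library(dendextend)  # color_branches()
dist_customers <- dist(customers_spend)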
hc_customers <- hclust(dist_customers)
clust_customers <- cutree(hc_customers, h = 15000)
segment_customers <- mutate(customers_spend, cluster = clust_customers)
# Count the number of customers that fall into each cluster
count(___, ___)
# Color the dendrogram based on the height cutoff
dend_customers <- as.dendrogram(hc_customers)
dend_colored <- color_branches(___, ___)
# Plot the colored dendrogram
# Calculate the mean for each category
segment_customers %>%
  group_by(cluster) %>%
  summarise_all(list(mean))
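The steps above can be tried end to end on a small stand-in dataset. The following is only a sketch under stated assumptions: toy_spend and its values are invented for illustration and are not the course's customers_spend data, and the column names Milk, Grocery and Frozen are assumed. The calls themselves (count(), color_branches() with a height cutoff, and summarise_all()) come from dplyr and dendextend.

```r
library(dplyr)
library(dendextend)

# Hypothetical stand-in for customers_spend (made-up values, assumed column names)
toy_spend <- data.frame(
  Milk    = c(2500, 3000, 3500, 2800, 3200, 29000, 31000, 30000),
  Grocery = c(4500, 5000, 5500, 5200, 4800, 39000, 41000, 40000),
  Frozen  = c(1800, 2000, 2200, 2100, 1900,  2900,  3100,  3000)
)

# Same pipeline as the exercise: distances, hierarchical clustering, cut at h = 15000
toy_hc       <- hclust(dist(toy_spend))
toy_clusters <- cutree(toy_hc, h = 15000)
toy_segments <- mutate(toy_spend, cluster = toy_clusters)

# Size of each cluster
cluster_sizes <- count(toy_segments, cluster)

# Color the dendrogram branches at the same height cutoff, then plot it
toy_dend_colored <- color_branches(as.dendrogram(toy_hc), h = 15000)
plot(toy_dend_colored)

# Mean spending per category within each cluster
cluster_means <- toy_segments %>%
  group_by(cluster) %>%
  summarise_all(list(mean))

print(cluster_sizes)
print(cluster_means)
```

On this toy data the two spending groups are far more than 15,000 apart, so the cut yields two clusters, and the per-cluster means make the difference between low and high spenders visible without a plot of the raw points.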