Explore wholesale customer clusters
Continuing your work on the wholesale dataset, you are now ready to analyze the characteristics of these clusters.
Since you are working with more than two dimensions, it would be challenging to visualize the clusters in a scatter plot. Instead, you will rely on summary statistics to explore them. In this exercise, you will analyze the mean amount spent in each cluster for all three categories.
This exercise is part of the course
Cluster Analysis in R
Exercise instructions
- Calculate the size of each cluster using count().
- Color and plot the dendrogram using a height of 15,000.
- Calculate the average spending for each category within each cluster using the summarise_all() function.
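The coloring step can be tried on a small made-up dataset first. The sketch below assumes the dendextend package is installed. The toy_spend data frame, its column names, and the cutoff height of 1000 are invented for illustration and are not taken from the wholesale data.

```r
library(dendextend)  # provides color_branches()

# Hypothetical toy data: six customers, three made-up spending categories
toy_spend <- data.frame(
  Milk    = c(100, 120, 110, 5000, 5200, 5100),
  Grocery = c(200, 210, 190, 7000, 7100, 6900),
  Frozen  = c(50, 60, 55, 300, 310, 290)
)

# Same pipeline shape as the exercise: distances, hierarchy, dendrogram
hc_toy   <- hclust(dist(toy_spend))
dend_toy <- as.dendrogram(hc_toy)

# Color branches by the clusters formed when cutting at height 1000
# (1000 is an arbitrary cutoff chosen to suit this toy data)
dend_toy_colored <- color_branches(dend_toy, h = 1000)

# Draw the colored dendrogram
plot(dend_toy_colored)

# Number of clusters produced by the same cut
n_clusters_toy <- length(unique(cutree(hc_toy, h = 1000)))
```

Passing the same height to cutree() and color_branches() keeps the colors in the plot consistent with the cluster assignments.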
Hands-on interactive exercise
Try this exercise by completing this sample code.
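# Load the packages used below: dplyr for mutate(), count(), and the pipe; dendextend for color_branches()
library(dplyr)
library(dendextend)
# Compute pairwise Euclidean distances between customers
dist_customers <- dist(customers_spend)
# Build the hierarchy (hclust() defaults to complete linkage)
hc_customers <- hclust(dist_customers)
# Cut the tree at a height of 15,000 to assign each customer to a cluster
clust_customers <- cutree(hc_customers, h = 15000)
# Attach the cluster assignments to the spending data
segment_customers <- mutate(customers_spend, cluster = clust_customers)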
# Count the number of customers that fall into each cluster
count(___, ___)
# Color the dendrogram based on the height cutoff
dend_customers <- as.dendrogram(hc_customers)
dend_colored <- color_branches(___, ___)
# Plot the colored dendrogram
# Calculate the mean for each category
segment_customers %>%
  group_by(cluster) %>%
  summarise_all(list(mean))
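In dplyr 1.0.0 and later, summarise_all() still works but is superseded by across(). As a minimal sketch, the counting and averaging steps could be written with across() as shown below. The toy_segments data frame, its column names, and its values are hypothetical stand-ins for segment_customers.

```r
library(dplyr)

# Hypothetical stand-in for segment_customers: spending plus a cluster label
toy_segments <- data.frame(
  Milk    = c(100, 120, 110, 5000, 5200, 5100),
  Grocery = c(200, 210, 190, 7000, 7100, 6900),
  Frozen  = c(50, 60, 55, 300, 310, 290),
  cluster = c(1, 1, 1, 2, 2, 2)
)

# Size of each cluster: one row per cluster, with the count in column n
cluster_sizes <- count(toy_segments, cluster)

# Mean of every non-grouping column within each cluster
cluster_means <- toy_segments %>%
  group_by(cluster) %>%
  summarise(across(everything(), mean))

print(cluster_sizes)
print(cluster_means)
```

Reading cluster_sizes alongside cluster_means helps judge how much weight to give each average, since a mean computed over only a few customers is less stable than one computed over many.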