
Explore wholesale customer clusters

Continuing your work on the wholesale dataset, you are now ready to analyze the characteristics of the customer clusters.

Because you are working with more than two dimensions, a scatter plot of the clusters would be hard to read. Instead, you will rely on summary statistics to explore the clusters. In this exercise you will analyze the mean amount spent in each cluster for all three categories.
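As a rough illustration of what "summary statistics per cluster" means, the sketch below computes category means on a tiny made-up data frame. The data, the column names (`Milk`, `Grocery`, `Frozen`), and the hand-assigned cluster labels are assumptions for illustration only, not the course dataset.

```r
library(dplyr)

# Hypothetical toy data: six customers, three spending categories,
# with cluster labels assigned by hand (not produced by hclust here).
toy_spend <- data.frame(
  Milk    = c(100, 120, 110, 900, 950, 1000),
  Grocery = c(200, 220, 210, 300, 320, 310),
  Frozen  = c(50, 60, 55, 40, 45, 35),
  cluster = c(1, 1, 1, 2, 2, 2)
)

# One row per cluster, one mean per spending category
toy_means <- toy_spend %>%
  group_by(cluster) %>%
  summarise_all(list(mean))

print(toy_means)
```

Reading the resulting table row by row shows how the clusters differ in each category, which stands in for the scatter plot you cannot easily draw in three dimensions.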

This exercise is part of the course

Cluster Analysis in R


Exercise instructions

  • Calculate the size of each cluster using count().
  • Color and plot the dendrogram using a height cutoff of 15,000.
  • Calculate the average spending for each category within each cluster using the summarise_all() function.
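The steps above can be sketched end to end on a small made-up dataset. Everything named `toy` here is hypothetical, and the cutoff of 50 is chosen only to suit this toy data; the sketch assumes the dplyr and dendextend packages are installed.

```r
library(dplyr)
library(dendextend)

# Hypothetical data: two well-separated groups of five points each
toy <- data.frame(
  a = c(1, 2, 3, 4, 5, 101, 102, 103, 104, 105),
  b = c(1, 2, 3, 4, 5, 101, 102, 103, 104, 105)
)

# Hierarchical clustering (hclust defaults to complete linkage)
hc_toy <- hclust(dist(toy))

# Cut the tree at a height that separates the two groups
clust_toy <- cutree(hc_toy, h = 50)
segment_toy <- mutate(toy, cluster = clust_toy)

# Size of each cluster
toy_sizes <- count(segment_toy, cluster)
print(toy_sizes)

# Color the branches below the same height cutoff, then plot
dend_toy <- as.dendrogram(hc_toy)
dend_toy_colored <- color_branches(dend_toy, h = 50)
plot(dend_toy_colored)
```

Using the same `h` in both `cutree()` and `color_branches()` keeps the colored branches consistent with the cluster assignments.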

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

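# Load packages: dplyr for mutate(), count(), and %>%; dendextend for color_branches()
library(dplyr)
library(dendextend)

# Cluster the customers and cut the tree at a height of 15,000
# (customers_spend is assumed to be loaded already)
dist_customers <- dist(customers_spend)
hc_customers <- hclust(dist_customers)
clust_customers <- cutree(hc_customers, h = 15000)
segment_customers <- mutate(customers_spend, cluster = clust_customers)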

# Count the number of customers that fall into each cluster
count(___, ___)

# Color the dendrogram based on the height cutoff
dend_customers <- as.dendrogram(hc_customers)
dend_colored <- color_branches(___, ___)

# Plot the colored dendrogram


# Calculate the mean for each category
segment_customers %>% 
  group_by(cluster) %>% 
  summarise_all(list(mean))
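In dplyr 1.0.0 and later, `summarise_all()` is marked as superseded in favor of `across()`. The sketch below shows both forms side by side on a hypothetical stand-in for `segment_customers`; the data and the names ending in `_demo` are assumptions for illustration.

```r
library(dplyr)

# Hypothetical stand-in for segment_customers
segment_demo <- data.frame(
  Milk    = c(10, 20, 30, 40),
  Grocery = c(1, 3, 5, 7),
  cluster = c(1, 1, 2, 2)
)

# Form used in the exercise
means_all_demo <- segment_demo %>%
  group_by(cluster) %>%
  summarise_all(list(mean))

# Equivalent form with across(); grouping columns are skipped automatically
means_across_demo <- segment_demo %>%
  group_by(cluster) %>%
  summarise(across(everything(), mean))

print(means_across_demo)
```

Both calls return one row per cluster with the original column names, so either can be used for this exercise.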