Get Started

Elbow (Scree) plot

In the previous exercises you have calculated the total within-cluster sum of squares for values of k ranging from 1 to 10. You can visualize this relationship using a line plot to create what is known as an elbow plot (or scree plot).

When looking at an elbow plot you want to see a sharp decline from one k to another followed by a more gradual decrease in slope. The last value of k before the slope of the plot levels off suggests a "good" value of k.

This is a part of the course

“Cluster Analysis in R”

View Course

Exercise instructions

  • Continuing your work from the previous exercise, use the values in elbow_df to plot a line plot showing the relationship between k and total within-cluster sum of squares.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Use map_dbl to run many models with varying value of k (centers)
tot_withinss <- map_dbl(1:10,  function(k){
  model <- kmeans(x = lineup, centers = k)
  model$tot.withinss
})

# Generate a data frame containing both k and tot_withinss
elbow_df <- data.frame(
  k = 1:10,
  tot_withinss = tot_withinss
)

# Plot the elbow plot
ggplot(___, aes(x = ___, y = ___)) +
  geom_line() +
  scale_x_continuous(breaks = 1:10)
Edit and Run Code