Exercise

Many K's many models

While the lineup dataset has a known value of k, the optimal number of clusters is often unknown and must be estimated.

In this exercise you will use map_dbl() from the purrr package to run k-means for values of k ranging from 1 to 10 and extract the total within-cluster sum of squares from each model. This is the first step towards building the elbow plot.

Instructions

  • Use map_dbl() to run kmeans() using the lineup data for k values ranging from 1 to 10 and extract the total within-cluster sum of squares value from each model: model$tot.withinss. Store the resulting vector as tot_withinss.
  • Build a new data frame elbow_df containing the values of k and the vector of total within-cluster sum of squares.
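
The two steps above can be sketched as follows. This is a minimal illustration, not the course's reference solution: the exercise's lineup data is not reproduced here, so the code builds a simulated stand-in with two numeric columns (the column names x and y and the simulated values are assumptions).

```r
library(purrr)

# Hypothetical stand-in for the lineup data: two groups of 2-D points.
# The real dataset is supplied by the exercise environment.
set.seed(42)
lineup <- data.frame(
  x = c(rnorm(12, mean = 0), rnorm(12, mean = 10)),
  y = c(rnorm(12, mean = 0), rnorm(12, mean = 10))
)

# Fit one k-means model per k and keep only its total
# within-cluster sum of squares.
tot_withinss <- map_dbl(1:10, function(k) {
  model <- kmeans(x = lineup, centers = k)
  model$tot.withinss
})

# Pair each k with its metric so it can be plotted later.
elbow_df <- data.frame(
  k = 1:10,
  tot_withinss = tot_withinss
)
```

map_dbl() is used rather than map() because each model contributes a single number, so the result is a plain numeric vector that drops straight into a data frame column.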