
K-means: Average Silhouette Widths

Hierarchical clustering resulted in 3 clusters, while the elbow method suggests 2. In this exercise, use average silhouette widths to explore what the "best" value of k should be.
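As a reminder of what is being measured, the sketch below shows that the average silhouette width reported by pam() is simply the mean of the per-observation silhouette widths. It assumes the cluster package is installed; the toy matrix is a hypothetical stand-in, not the course's oes data.

```r
library(cluster)  # provides pam() and silhouette()

set.seed(42)
# Hypothetical stand-in data: two well-separated 2-D groups of 20 points each
toy <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(40, mean = 4, sd = 0.5), ncol = 2)
)

model <- pam(toy, k = 2)

# Per-observation silhouette widths: s(i) = (b(i) - a(i)) / max(a(i), b(i)),
# where a(i) is the mean distance to the observation's own cluster and
# b(i) is the mean distance to the nearest other cluster
sil <- silhouette(model)
avg_by_hand <- mean(sil[, "sil_width"])

# pam() stores the same average in silinfo$avg.width
avg_from_model <- model$silinfo$avg.width
```

Values close to 1 indicate observations that sit well inside their cluster, values near 0 indicate observations on a boundary, and negative values suggest a possible misassignment.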

This exercise is part of the course

Cluster Analysis in R


Exercise instructions

  • Use map_dbl() to run pam() using the oes data for k values ranging from 2 to 10 and extract the average silhouette width value from each model: model$silinfo$avg.width. Store the resulting vector as sil_width.
  • Build a new data frame sil_df containing the values of k and the vector of average silhouette widths.
  • Use the values in sil_df to plot a line plot showing the relationship between k and average silhouette width.

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Use map_dbl to run many models with varying value of k
sil_width <- map_dbl(2:10,  function(k){
  model <- pam(___, k = ___)
  model$silinfo$avg.width
})

# Generate a data frame containing both k and sil_width
sil_df <- data.frame(
  k = ___,
  sil_width = ___
)

# Plot the relationship between k and sil_width
ggplot(___, aes(x = ___, y = ___)) +
  geom_line() +
  scale_x_continuous(breaks = 2:10)
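
The same workflow can be sketched end to end on stand-in data. This is a minimal illustration, not the exercise solution: toy is a hypothetical simulated matrix used in place of the course's oes data, and best_k is an illustrative extra that reads off the k with the highest average silhouette width. It assumes the cluster, purrr, and ggplot2 packages are installed.

```r
library(cluster)   # pam()
library(purrr)     # map_dbl()
library(ggplot2)   # ggplot(), geom_line(), scale_x_continuous()

set.seed(1)
# Hypothetical stand-in for the oes data: three 2-D groups of 20 points each
toy <- rbind(
  matrix(rnorm(40, mean = 0), ncol = 2),
  matrix(rnorm(40, mean = 5), ncol = 2),
  matrix(rnorm(40, mean = 10), ncol = 2)
)

# Fit one pam() model per k and keep only its average silhouette width
sil_width <- map_dbl(2:10, function(k) {
  model <- pam(toy, k = k)
  model$silinfo$avg.width
})

# Pair each k with its average silhouette width
sil_df <- data.frame(k = 2:10, sil_width = sil_width)

# Line plot of average silhouette width against k
p <- ggplot(sil_df, aes(x = k, y = sil_width)) +
  geom_line() +
  scale_x_continuous(breaks = 2:10)

# Illustrative extra: the k with the highest average silhouette width
best_k <- sil_df$k[which.max(sil_df$sil_width)]
```

Printing p draws the plot. The peak of the line marks the k that the silhouette criterion favors, which is worth comparing against the values suggested by hierarchical clustering and the elbow method.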