Revisiting wholesale data: Exploration

From the previous analysis, you found that k = 2 has the highest average silhouette width. In this exercise, you will continue to analyze the wholesale customer data by building and exploring a k-means model with 2 clusters.

This exercise is part of the course

Cluster Analysis in R

Exercise instructions

Build a k-means model called model_customers for the customers_spend data using the kmeans() function with centers = 2.
Extract the vector of cluster assignments from the model (model_customers$cluster) and store it in the variable clust_customers.
Append the cluster assignments as a column cluster to the customers_spend data frame and save the results to a new data frame called segment_customers.
Calculate the size of each cluster using count().
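
The four steps above can be sketched on made-up data. This is a minimal illustration, not the exercise solution: the data frame toy_spend, its column names and values, and the names model_toy, clust_toy, segment_toy and cluster_sizes are all hypothetical stand-ins for customers_spend and the variables the exercise asks for.

```r
library(dplyr)

set.seed(42)

# Hypothetical stand-in for customers_spend: two clearly separated spending groups
toy_spend <- data.frame(
  Milk    = c(rnorm(20, mean = 2000, sd = 200), rnorm(10, mean = 12000, sd = 500)),
  Grocery = c(rnorm(20, mean = 3000, sd = 300), rnorm(10, mean = 20000, sd = 800)),
  Frozen  = c(rnorm(20, mean = 1500, sd = 150), rnorm(10, mean = 4000, sd = 300))
)

# Step 1: fit a k-means model with 2 clusters
model_toy <- kmeans(toy_spend, centers = 2)

# Step 2: pull out the integer vector of cluster assignments (one per row)
clust_toy <- model_toy$cluster

# Step 3: append the assignments as a new column
segment_toy <- mutate(toy_spend, cluster = clust_toy)

# Step 4: count the rows in each cluster
cluster_sizes <- count(segment_toy, cluster)
print(cluster_sizes)
```

Because kmeans() picks its starting centers at random, the cluster labels (1 or 2) can swap between runs unless the seed is fixed, which is why the sample code calls set.seed() first.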

Hands-on interactive exercise

Try this exercise by completing this sample code.

set.seed(42)

# Build a k-means model for customers_spend with k = 2
model_customers <- ___

# Extract the vector of cluster assignments from the model
clust_customers <- ___

# Build the segment_customers data frame
segment_customers <- mutate(___, cluster = ___)

# Calculate the size of each cluster
count(___, ___)

# Calculate the mean for each category
segment_customers %>% 
  group_by(cluster) %>% 
  summarise_all(list(mean))
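
The per-cluster mean step can also be written with across(), which recent dplyr versions (1.0.0 and later) offer in place of the scoped summarise_all() verb. The sketch below assumes a small hypothetical data frame, toy_segments, that already carries a cluster column; its values are made up for illustration.

```r
library(dplyr)

# Hypothetical data with cluster assignments already attached
toy_segments <- data.frame(
  Milk    = c(100, 200, 1000, 1200),
  Grocery = c(300, 500, 2000, 2400),
  cluster = c(1, 1, 2, 2)
)

# Mean of every non-grouping column, computed separately for each cluster
cluster_means <- toy_segments %>%
  group_by(cluster) %>%
  summarise(across(everything(), mean))

print(cluster_means)
```

Inside a grouped summarise(), everything() skips the grouping column, so only the spending columns are averaged.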
