Revisiting wholesale data: Exploration
In the previous analysis, you found that k = 2 has the highest average silhouette width. In this exercise, you will continue analyzing the wholesale customer data by building and exploring a k-means model with 2 clusters.
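As a reminder of where a result like "k = 2 has the highest average silhouette width" comes from, here is a minimal sketch. It assumes the cluster package is available and uses a hypothetical data frame, toy_spend, as a stand-in for the real wholesale data; the column names and values are invented for illustration only.

```r
library(cluster)  # provides pam(); assumed to be installed

set.seed(42)
# Hypothetical stand-in for the wholesale data: two well-separated groups
toy_spend <- data.frame(
  Milk    = c(rnorm(20, mean = 2000, sd = 200), rnorm(20, mean = 12000, sd = 800)),
  Grocery = c(rnorm(20, mean = 3000, sd = 300), rnorm(20, mean = 18000, sd = 900)),
  Frozen  = c(rnorm(20, mean = 2500, sd = 250), rnorm(20, mean = 1500, sd = 150))
)

# Average silhouette width for each candidate k from 2 to 6
sil_width <- sapply(2:6, function(k) pam(toy_spend, k = k)$silinfo$avg.width)
names(sil_width) <- 2:6

# The k with the highest average silhouette width
best_k <- as.integer(names(which.max(sil_width)))
print(sil_width)
print(best_k)
```

The sketch uses pam() because it reports the average silhouette width directly; a k-means fit could be scored in a similar way with cluster::silhouette() and a distance matrix.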
This exercise is part of the course
Cluster Analysis in R
Instructions
- Build a k-means model called model_customers for the customers_spend data using the kmeans() function with centers = 2.
- Extract the vector of cluster assignments from the model (model_customers$cluster) and store it in the variable clust_customers.
- Append the cluster assignments as a column named cluster to the customers_spend data frame and save the result to a new data frame called segment_customers.
- Calculate the size of each cluster using count().
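The four steps above can be sketched on made-up data. This is only an illustration under stated assumptions: toy_spend, model_toy, clust_toy, and segment_toy are hypothetical names that mirror the exercise objects, the data are randomly generated, and dplyr is assumed to be installed.

```r
library(dplyr)

set.seed(42)
# Hypothetical stand-in for customers_spend (invented values)
toy_spend <- data.frame(
  Milk    = c(rnorm(20, mean = 2000, sd = 200), rnorm(20, mean = 12000, sd = 800)),
  Grocery = c(rnorm(20, mean = 3000, sd = 300), rnorm(20, mean = 18000, sd = 900)),
  Frozen  = c(rnorm(20, mean = 2500, sd = 250), rnorm(20, mean = 1500, sd = 150))
)

# Step 1: fit a k-means model with two clusters
model_toy <- kmeans(toy_spend, centers = 2)

# Step 2: one integer cluster label per row of the input
clust_toy <- model_toy$cluster

# Step 3: attach the labels as a new column
segment_toy <- mutate(toy_spend, cluster = clust_toy)

# Step 4: number of observations in each cluster
cluster_sizes <- count(segment_toy, cluster)
print(cluster_sizes)
```

Because kmeans() starts from randomly chosen centers, the cluster labels (1 or 2) can swap between runs; calling set.seed() beforehand keeps the result reproducible.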
Hands-on interactive exercise
Try this exercise by completing this sample code.
set.seed(42)
# Build a k-means model for the customers_spend with a k of 2
model_customers <- ___
# Extract the vector of cluster assignments from the model
clust_customers <- ___
# Build the segment_customers data frame
segment_customers <- mutate(___, cluster = ___)
# Calculate the size of each cluster
count(___, ___)
# Calculate the mean for each category
segment_customers %>%
  group_by(cluster) %>%
  summarise_all(list(mean))
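In dplyr 1.0.0 and later, summarise_all() is marked as superseded in favor of across(). The sketch below shows an equivalent per-cluster mean with across(), assuming a recent dplyr; segment_toy is a small hypothetical data frame with invented values, not the exercise data.

```r
library(dplyr)

# Hypothetical segmented data frame with a cluster column (invented values)
segment_toy <- data.frame(
  Milk    = c(100, 200, 1000, 1200),
  Grocery = c(300, 500, 2000, 2200),
  cluster = c(1, 1, 2, 2)
)

# Mean of every non-grouping column within each cluster
cluster_means <- segment_toy %>%
  group_by(cluster) %>%
  summarise(across(everything(), mean))

print(cluster_means)
```

Inside a grouped summarise(), everything() leaves out the grouping column, so only the spending columns are averaged.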