K-means on a soccer field
In the previous chapter, you used the lineup dataset to learn about hierarchical clustering. In this chapter, you will use the same data to learn about k-means clustering.
As a reminder, the lineup data frame contains the positions of 12 players at the start of a 6v6 soccer match.
Just like before, you know that this match has two teams on the field, so you can perform a k-means analysis using k = 2 to determine which player belongs to which team.
Note that in the kmeans() function, k is specified using the centers parameter.
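To make the role of centers concrete, here is a minimal sketch on made-up data. The toy data frame and the names toy and toy_km are illustrative assumptions, not part of the lineup dataset; only kmeans() and its centers argument come from the text above.

```r
# Hypothetical toy data: two groups of points (NOT the lineup data)
set.seed(42)
toy <- data.frame(
  x = c(rnorm(6, mean = -10, sd = 1), rnorm(6, mean = 10, sd = 1)),
  y = rnorm(12, mean = 0, sd = 2)
)

# centers = 2 asks kmeans() for k = 2 clusters
toy_km <- kmeans(toy, centers = 2)

# One integer cluster label per row of the input
toy_km$cluster

# One row of coordinates per fitted cluster center
toy_km$centers
```

Because k-means starts from randomly chosen centers, the cluster labels (1 or 2) can swap between runs unless a seed is set first.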
This exercise is part of the course Cluster Analysis in R.
Exercise instructions
- Build a k-means model called model_km2 for the lineup data using the kmeans() function with centers = 2.
- Extract the vector of cluster assignments from the model, model_km2$cluster, and store it in the variable clust_km2.
- Append the cluster assignments as a column cluster to the lineup data frame and save the results to a new data frame called lineup_km2.
- Use ggplot to plot the positions of each player on the field and color them by their cluster.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# mutate() comes from dplyr and ggplot() from ggplot2
library(dplyr)
library(ggplot2)

# Build a kmeans model
model_km2 <- kmeans(___, centers = ___)
# Extract the cluster assignment vector from the kmeans model
clust_km2 <- ___
# Create a new data frame appending the cluster assignment
lineup_km2 <- mutate(___, cluster = ___)
# Plot the positions of the players and color them using their cluster
ggplot(___, aes(x = ___, y = ___, color = factor(___))) +
  geom_point()
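The steps in the template can be sketched end to end on a stand-in data frame. Everything named with _demo below is hypothetical: the real lineup data is not reproduced here, and the column names x and y are an assumption about its layout. Treat this as an illustration of the workflow rather than the exercise's reference solution.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for lineup: 12 players with assumed columns x and y
set.seed(1)
lineup_demo <- data.frame(
  x = c(runif(6, min = -40, max = -5), runif(6, min = 5, max = 40)),
  y = runif(12, min = -25, max = 25)
)

# Fit k-means with k = 2 (one cluster per expected team)
model_demo <- kmeans(lineup_demo, centers = 2)

# Pull out the cluster label assigned to each player
clust_demo <- model_demo$cluster

# Append the labels as a new column named cluster
lineup_demo_km2 <- mutate(lineup_demo, cluster = clust_demo)

# Plot player positions, colored by cluster; factor() makes the color discrete
p <- ggplot(lineup_demo_km2, aes(x = x, y = y, color = factor(cluster))) +
  geom_point()
print(p)
```

Note that k-means only sees positions, so the clusters correspond to teams only to the extent that the two teams occupy separate regions of the field.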