
K-means on a soccer field

In the previous chapter, you used the lineup dataset to learn about hierarchical clustering. In this chapter, you will use the same data to learn about k-means clustering. As a reminder, the lineup data frame contains the positions of 12 players at the start of a 6v6 soccer match.

Just like before, you know that this match has two teams on the field, so you can perform a k-means analysis with k = 2 to determine which player belongs to which team.

Note that in the kmeans() function k is specified using the centers parameter.
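
To make the role of centers concrete, here is a minimal sketch on made-up data. The toy data frame and the name toy_km are assumptions for illustration only and are not the course's lineup data. Because kmeans() starts from randomly chosen centers, the sketch fixes the random seed so that repeated runs give the same result.

```r
# Hypothetical toy data: two well-separated groups of points
# (an assumption for illustration, not the lineup data)
toy <- data.frame(
  x = c(1, 2, 1.5, 10, 11, 10.5),
  y = c(1, 1.5, 2, 10, 10.5, 11)
)

# k-means starts from random centers, so fix the seed for reproducibility
set.seed(42)

# centers = 2 asks for k = 2 clusters
toy_km <- kmeans(toy, centers = 2)

toy_km$cluster  # integer vector with one cluster label per row
toy_km$centers  # matrix of cluster centroids, one row per cluster
toy_km$size     # number of observations assigned to each cluster
```

The cluster labels themselves (1 or 2) are arbitrary: which group receives which number can change with the seed, so only the grouping carries meaning.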

This exercise is part of the course

Cluster Analysis in R


Exercise instructions

  • Build a k-means model called model_km2 for the lineup data using the kmeans() function with centers = 2.
  • Extract the vector of cluster assignments from the model with model_km2$cluster and store it in the variable clust_km2.
  • Append the cluster assignments as a column cluster to the lineup data frame and save the results to a new data frame called lineup_km2.
  • Use ggplot to plot the positions of each player on the field and color them by their cluster.
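
The four steps above can be sketched end to end on a stand-in data frame. Everything named toy_ below is hypothetical, including the coordinates and the column names x and y, which are assumptions rather than the actual columns of lineup. The sketch assumes the dplyr and ggplot2 packages are installed.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for a 12-player data frame
# (made-up coordinates; column names x and y are assumptions)
toy_lineup <- data.frame(
  x = c(-5, -12, -20, -25, -18, -30, 5, 12, 20, 25, 18, 30),
  y = c(0, 10, -8, 4, -15, 0, 0, -10, 8, -4, 15, 0)
)

# Step 1: fit k-means with k = 2 (seed fixed because the start is random)
set.seed(1)
toy_model <- kmeans(toy_lineup, centers = 2)

# Step 2: pull out the per-row cluster labels
toy_clust <- toy_model$cluster

# Step 3: append the labels as a new column
toy_clustered <- mutate(toy_lineup, cluster = toy_clust)

# Step 4: plot positions, colored by cluster
# factor() makes ggplot treat the integer labels as discrete groups
p <- ggplot(toy_clustered, aes(x = x, y = y, color = factor(cluster))) +
  geom_point()
p
```

Wrapping cluster in factor() matters because the labels are stored as integers; without it, ggplot would map them to a continuous color gradient instead of one distinct color per cluster.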

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Build a kmeans model
model_km2 <- kmeans(___, centers = ___)

# Extract the cluster assignment vector from the kmeans model
clust_km2 <- ___

# Create a new data frame appending the cluster assignment
lineup_km2 <- mutate(___, cluster = ___)

# Plot the positions of the players and color them using their cluster
ggplot(___, aes(x = ___, y = ___, color = factor(___))) +
  geom_point()