Exercise

K-means clustering: first exercise

This exercise introduces k-means clustering in practice. You will apply it to the Comic Con dataset and see how it groups the observations.

Recall the two steps of k-means clustering:

  • Define cluster centers with the kmeans() function. It has two required arguments: the observations and the number of clusters.
  • Assign cluster labels with the vq() function. It has two required arguments: the observations and the cluster centers.
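
The two steps above can be sketched as follows. This is a minimal illustration, not the exercise solution: the array obs is synthetic data invented here (two well-separated groups of points), and it assumes both functions are imported from scipy.cluster.vq.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq

# Synthetic observations (an assumption for illustration only):
# 20 points near (0, 0) followed by 20 points near (5, 5).
rng = np.random.default_rng(0)
obs = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(20, 2)),
    rng.normal(loc=5.0, scale=0.1, size=(20, 2)),
])

# Step 1: define cluster centers.
# kmeans() returns the centers and the mean distance of points to their center.
centers, distortion = kmeans(obs, 2)

# Step 2: assign each observation to its nearest center.
# vq() returns one label per observation and the distance to that center.
labels, distances = vq(obs, centers)

print(centers)
print(labels)
```

Note that kmeans() returns a tuple, so the centers must be unpacked before they are passed to vq(). Which group receives label 0 and which receives label 1 depends on the random initialization.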

The data is stored in a pandas DataFrame, comic_con. Its columns x_scaled and y_scaled hold the standardized X and Y coordinates of people at a given point in time.

Instructions

  • Import the kmeans and vq functions from scipy.cluster.vq.
  • Generate cluster centers using the kmeans() function with two clusters.
  • Create cluster labels using these cluster centers.
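
One possible way to carry out these instructions is sketched below. The real Comic Con data is not reproduced here, so the comic_con DataFrame in this sketch is a stand-in built from random numbers; only the column names x_scaled and y_scaled come from the exercise. The names points, distortion, distance_list, and cluster_labels are illustrative choices.

```python
import numpy as np
import pandas as pd
from scipy.cluster.vq import kmeans, vq

# Stand-in for the exercise data (an assumption, NOT the real Comic Con dataset):
# two groups of 15 people with already-standardized coordinates.
rng = np.random.default_rng(42)
comic_con = pd.DataFrame({
    "x_scaled": np.concatenate([rng.normal(0.5, 0.1, 15), rng.normal(2.5, 0.1, 15)]),
    "y_scaled": np.concatenate([rng.normal(0.5, 0.1, 15), rng.normal(2.5, 0.1, 15)]),
})

# Extract the two coordinate columns as a plain NumPy array of shape (30, 2).
points = comic_con[["x_scaled", "y_scaled"]].to_numpy()

# Generate cluster centers with two clusters.
cluster_centers, distortion = kmeans(points, 2)

# Create cluster labels from these centers and store them in the DataFrame.
comic_con["cluster_labels"], distance_list = vq(points, cluster_centers)

# Inspect the average position of each cluster.
print(comic_con.groupby("cluster_labels")[["x_scaled", "y_scaled"]].mean())
```

Storing the labels as a new column keeps each label next to its observation, which makes later grouping or plotting by cluster straightforward.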