Uniform clustering patterns
Now that you are familiar with the impact of seeds, let us look at the bias in k-means clustering towards the formation of uniform clusters.
Let us use a mouse-like dataset for our next exercise. A mouse-like dataset is a group of points that resembles the head of a mouse: three clusters of points arranged in circles, one for the face and one for each of the two ears.
Here is what a typical mouse-like dataset looks like (Source).
The data is stored in a pandas DataFrame, mouse, whose columns x_scaled and y_scaled hold the standardized X and Y coordinates of the data points.
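For readers who want to follow along without the course data, here is a minimal sketch of how a mouse-like DataFrame with these column names could be built. The disk() helper, the cluster sizes, and the centers are made up for illustration, and the sketch assumes the standardization was done by dividing each column by its standard deviation, which is what SciPy's whiten() does.

```python
import numpy as np
import pandas as pd
from scipy.cluster.vq import whiten

rng = np.random.default_rng(42)

def disk(center, radius, n):
    """Sample n points uniformly from a disk (hypothetical helper)."""
    r = radius * np.sqrt(rng.uniform(size=n))
    theta = rng.uniform(0, 2 * np.pi, size=n)
    return np.column_stack([center[0] + r * np.cos(theta),
                            center[1] + r * np.sin(theta)])

# One large disk for the face and two small disks for the ears.
# All sizes and positions here are illustrative assumptions.
points = np.vstack([
    disk((0.0, 0.0), 1.0, 300),
    disk((-1.0, 1.0), 0.4, 60),
    disk((1.0, 1.0), 0.4, 60),
])

# whiten() divides each column by its standard deviation.
scaled = whiten(points)
mouse = pd.DataFrame(scaled, columns=['x_scaled', 'y_scaled'])
print(mouse.describe())
```

The face is deliberately given far more points than either ear, since unequal cluster sizes are what expose the bias discussed in this exercise.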
This exercise is part of the course Cluster Analysis in Python.
Exercise instructions
- Import the kmeans and vq functions from SciPy.
- Generate cluster centers using the kmeans() function with three clusters.
- Create cluster labels with vq(), using the cluster centers generated above.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import the kmeans and vq functions
____
# Generate cluster centers
cluster_centers, distortion = ____
# Assign cluster labels
mouse['cluster_labels'], distortion_list = ____
# Plot clusters
sns.scatterplot(x='x_scaled', y='y_scaled',
                hue='cluster_labels', data=mouse)
plt.show()
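One possible way to fill in the blanks is sketched below. It is not the official solution. Because the course data is not available here, the sketch builds a synthetic stand-in for mouse (the blob sizes and positions are assumptions), and it prints cluster sizes instead of plotting so that it runs without seaborn or matplotlib.

```python
import numpy as np
import pandas as pd
from scipy.cluster.vq import kmeans, vq, whiten

# Synthetic stand-in for `mouse`: a large face blob and two small ear blobs.
# The sizes (300, 50, 50) and positions are illustrative assumptions.
rng = np.random.default_rng(0)
face = rng.normal(loc=(0.0, 0.0), scale=0.5, size=(300, 2))
left_ear = rng.normal(loc=(-1.2, 1.2), scale=0.15, size=(50, 2))
right_ear = rng.normal(loc=(1.2, 1.2), scale=0.15, size=(50, 2))
mouse = pd.DataFrame(whiten(np.vstack([face, left_ear, right_ear])),
                     columns=['x_scaled', 'y_scaled'])

# Seed NumPy's global random state; kmeans() picks its starting centers
# at random, and whether this seed controls them depends on the SciPy version.
np.random.seed(0)

features = mouse[['x_scaled', 'y_scaled']].to_numpy()

# Generate cluster centers
cluster_centers, distortion = kmeans(features, 3)

# Assign cluster labels
mouse['cluster_labels'], distortion_list = vq(features, cluster_centers)

# Compare the cluster sizes with the true group sizes (300, 50, 50)
print(mouse['cluster_labels'].value_counts())
```

If the bias towards uniform clusters shows up, the printed counts will be more balanced than 300, 50, and 50, because points from the face get assigned to the ear clusters. The sns.scatterplot() call from the exercise can then be applied to mouse to see this visually.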