K-means clustering
In a machine learning interview, you might be asked how the output of K-means clustering can be used to assess how well the model fits the data.
In this exercise you'll practice K-means clustering. You'll use the .inertia_ attribute to compare models with different numbers of clusters, k, and then use this information to assess the number of clusters in the next exercise.
Recall that the target variable in the diabetes dataset is progression.
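For reference, the value scikit-learn reports as inertia is the within-cluster sum of squared distances from each sample to its nearest centroid. The symbols below are notation chosen here, not taken from the exercise: x_i is the i-th sample, mu_j is the centroid of cluster j, n is the number of samples, and k is the number of clusters.

```latex
\text{inertia} = \sum_{i=1}^{n} \min_{j \in \{1, \dots, k\}} \lVert x_i - \mu_j \rVert^2
```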
This exercise is part of the course Practicing Machine Learning Interview Questions in Python.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import module
from sklearn.cluster import KMeans
# Create feature matrix
X = diabetes.____("____", axis=1)
# Instantiate
kmeans = KMeans(n_clusters=2, random_state=123)
# Fit
fit = kmeans.____(____)
# Print inertia
print("Sum of squared distances for 2 clusters is", kmeans.inertia_)
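The comparison across different values of k described in the introduction could look roughly like the sketch below. It is one possible approach, not the course's solution. It assumes that scikit-learn's bundled diabetes data can stand in for the exercise's diabetes DataFrame (its target column is renamed to progression to match), and the range of k values and the n_init setting are arbitrary choices made for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_diabetes

# Assumption: scikit-learn's bundled diabetes data stands in for the course's
# `diabetes` DataFrame; its "target" column is renamed to "progression".
diabetes = load_diabetes(as_frame=True).frame.rename(
    columns={"target": "progression"}
)

# Feature matrix: every column except the target
X = diabetes.drop("progression", axis=1)

# Fit one model per candidate k and record its inertia
inertias = {}
for k in range(1, 6):  # illustrative range, not prescribed by the exercise
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=123)
    kmeans.fit(X)
    inertias[k] = kmeans.inertia_

# Report the within-cluster sum of squared distances for each k
for k, inertia in inertias.items():
    print(f"k={k}: sum of squared distances = {inertia:.4f}")
```

Inertia generally falls as k grows, so the smallest value alone does not identify the best k. What is usually more informative is how quickly the decrease levels off.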