
K-means clustering

In a machine learning interview, you might be asked how the output of K-means clustering can be used to assess how well the algorithm performs.

In this exercise, you'll practice K-means clustering. You'll use the .inertia_ attribute, the sum of squared distances from each sample to its nearest cluster center, to compare models with different numbers of clusters, k. In the next exercise, you'll use this information to assess the number of clusters.
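
To make the quantity behind .inertia_ concrete, here is a minimal pure-Python sketch of the same idea: the sum of squared distances from each point to its nearest centroid. The `inertia` helper, the toy points, and the hand-picked centroids are illustrative assumptions, not part of the course material or of scikit-learn.

```python
def inertia(points, centroids):
    """Sum of squared Euclidean distances from each point to its nearest centroid."""
    total = 0.0
    for p in points:
        # Squared distance from p to every centroid; keep the smallest one
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
    return total


# Hypothetical toy data: two tight groups, far apart on the x-axis
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

one_cluster = [(5.0, 1.0)]                 # a single centroid at the overall mean
two_clusters = [(0.0, 1.0), (10.0, 1.0)]   # one centroid per group

print(inertia(points, one_cluster))   # 104.0
print(inertia(points, two_clusters))  # 4.0
```

Adding a second centroid drops the value sharply here because the data really does form two groups. Inertia generally keeps falling as k grows, so on its own a lower value does not show that a larger k is better.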

Recall that the target variable in the diabetes dataset is progression.

Where you are in the pipeline:

[Figure: Machine learning pipeline]

This exercise is part of the course

Practicing Machine Learning Interview Questions in Python

Interactive exercise

Try this exercise by completing the sample code below.

# Import module
from sklearn.cluster import KMeans

# Create feature matrix
X = diabetes.____("____", axis=1)

# Instantiate
kmeans = KMeans(n_clusters=2, random_state=123)

# Fit
fit = kmeans.____(____)

# Print inertia
print("Sum of squared distances for 2 clusters is", kmeans.inertia_)
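
For comparing models with different numbers of clusters, one possible sketch is shown below. It assumes the course's `diabetes` DataFrame can be approximated by scikit-learn's bundled diabetes dataset with its `target` column renamed to `progression`. That stand-in, the range of k values, and the explicit `n_init=10` are assumptions for illustration, not the exercise's official setup or solution.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_diabetes

# Assumption: stand-in for the course's `diabetes` DataFrame, built from
# scikit-learn's bundled dataset with `target` renamed to `progression`
diabetes = load_diabetes(as_frame=True).frame.rename(columns={"target": "progression"})

# Feature matrix: every column except the target
X = diabetes.drop("progression", axis=1)

# Fit one model per candidate k and record its inertia
inertias = {}
for k in range(1, 6):
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=123)
    kmeans.fit(X)
    inertias[k] = kmeans.inertia_

for k, value in inertias.items():
    print(f"k={k}: sum of squared distances = {value:.3f}")
```

Plotting these values against k and looking for the point where the decrease levels off is one common way to pick a cluster count.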