Silhouette method
In the last lesson, you got a taste of how different numbers of clusters affect the performance of your K-Means algorithm. This is especially relevant in the context of an interview, since choosing the optimal number of clusters gives the best results.
In this exercise, you will use the silhouette_score() function from sklearn.metrics on K-Means algorithms run on the diabetes DataFrame to perform the Silhouette method for finding the optimal number of clusters. Note that you will use Euclidean distance when calculating the score, as this keeps it comparable with the Elbow method.
The feature matrix X, which you'll use to train the K-Means models, has been created for you.
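As a rough illustration of the Silhouette method described above, the following sketch fits K-Means for several cluster counts and compares their silhouette scores. It is not the exercise solution: the data is a synthetic stand-in generated with make_blobs (an assumption, since the diabetes DataFrame is not available here), and the range of k values, n_init, and random_state are arbitrary illustrative choices. The silhouette score is only defined for at least two clusters, so the loop starts at k = 2.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the feature matrix X (assumption: NOT the diabetes data)
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.5, random_state=42)

scores = {}
for k in range(2, 7):  # silhouette needs at least 2 clusters
    model = KMeans(n_clusters=k, n_init=10, random_state=42)
    # Fit the model and predict a cluster label for every sample
    labels = model.fit_predict(X)
    # Mean silhouette coefficient over all samples, using Euclidean distance
    scores[k] = silhouette_score(X, labels, metric="euclidean")

# The candidate with the highest mean silhouette score is preferred
best_k = max(scores, key=scores.get)

for k, score in scores.items():
    print(f"k={k}: silhouette={score:.3f}")
print("best k:", best_k)
```

Scores closer to 1 suggest compact, well-separated clusters, while scores near 0 suggest overlapping ones, so the k with the highest score is a reasonable candidate for the number of clusters.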
You're at the same point in the pipeline as in the last few exercises, but here you'll add a prediction step as well.
This exercise is part of the course
Practicing Machine Learning Interview Questions in Python
Interactive exercise
Try this exercise by completing this sample code.
# Import modules
from sklearn.____ import ____
from sklearn.____ import ____