Feature impact on cluster quality
Explore how individual features impact the clustering performance of a KMeans model. The dataset X is used for customer segmentation based on three features: income, number of kids, and number of teens in the house.
The silhouette_score function and the column_names variable have been pre-loaded for you.
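As a quick illustration of the metric used here, the sketch below computes a silhouette score for a KMeans model on synthetic data. The names `demo_X`, `demo_kmeans`, and `demo_score` are hypothetical and are not part of the exercise; the data is random and only stands in for a real dataset.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Hypothetical synthetic data: two well-separated blobs in two dimensions
rng = np.random.default_rng(0)
demo_X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(50, 2)),
])

# Fit KMeans, then score the resulting cluster assignments
demo_kmeans = KMeans(n_clusters=2, random_state=10, n_init=10).fit(demo_X)
demo_score = silhouette_score(demo_X, demo_kmeans.labels_)

# The silhouette score lies in [-1, 1]; higher means tighter, better-separated clusters
print(f"Silhouette score: {demo_score:.3f}")
```

The score takes the data and the cluster labels, so it can be recomputed whenever either one changes.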
This exercise is part of the course
Explainable AI in Python
Exercise instructions
- Derive the original silhouette score (original_score).
- In the for loop, remove features one by one and save the result in X_reduced.
- Compute the new silhouette score (new_score).
- Compute the impact of the feature.
Hands-on interactive exercise
Try this exercise by completing the sample code below.
kmeans = KMeans(n_clusters=5, random_state=10, n_init=10).fit(X)
# Derive the original silhouette score
original_score = ____

for i in range(X.shape[1]):
    # Remove feature at index i
    X_reduced = ____
    kmeans.fit(X_reduced)
    # Compute the new silhouette score
    new_score = ____
    # Compute the feature's impact
    impact = ____
    print(f'Feature {column_names[i]}: Impact = {impact}')
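One possible way to fill in the blanks is sketched below. It is not necessarily the course's reference solution. It rests on several assumptions: `X` is a NumPy array (so a column can be dropped with `np.delete`), the impact is taken to be `original_score - new_score`, and the real dataset is replaced by random, standardized stand-in data. The `impacts` dictionary is a hypothetical addition for collecting the results.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Assumption: random stand-in data, since the exercise's X is pre-loaded and not shown
rng = np.random.default_rng(42)
n_samples = 200
raw = np.column_stack([
    rng.normal(50_000, 15_000, size=n_samples),  # income
    rng.integers(0, 3, size=n_samples),          # number of kids
    rng.integers(0, 3, size=n_samples),          # number of teens
]).astype(float)
# Assumption: features are standardized so income does not dominate the distances
X = (raw - raw.mean(axis=0)) / raw.std(axis=0)
column_names = ["income", "kids", "teens"]  # hypothetical names

kmeans = KMeans(n_clusters=5, random_state=10, n_init=10).fit(X)
# Derive the original silhouette score from the full feature set
original_score = silhouette_score(X, kmeans.labels_)

impacts = {}  # hypothetical helper for inspecting results afterwards
for i in range(X.shape[1]):
    # Remove feature at index i (axis=1 drops a column)
    X_reduced = np.delete(X, i, axis=1)
    kmeans.fit(X_reduced)
    # Compute the new silhouette score on the reduced data
    new_score = silhouette_score(X_reduced, kmeans.labels_)
    # Assumed definition: a positive impact means quality fell without the feature
    impact = original_score - new_score
    impacts[column_names[i]] = impact
    print(f'Feature {column_names[i]}: Impact = {impact}')
```

Under this definition, a large positive impact suggests the feature helps separate the clusters, while a negative impact suggests the clustering scores higher without it. If `X` were a pandas DataFrame instead, the column would have to be dropped differently, for example with `X.drop(columns=...)`.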