Feature importance in clustering with ARI

Use the Adjusted Rand Index (ARI) to measure how much removing each feature changes the cluster assignments. The customer dataset from the previous exercise is pre-loaded in X.

The adjusted_rand_score() function and the column_names variable have been pre-loaded for you.
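
As a quick reminder of what ARI measures, the short sketch below compares made-up label lists (they are not taken from the customer dataset). It illustrates that ARI depends only on how points are grouped, not on which integer each cluster happens to receive, which is why it is a reasonable way to compare two separate K-means runs.

```python
from sklearn.metrics import adjusted_rand_score

# Hypothetical label lists, used only for illustration
labels_a = [0, 0, 1, 1, 2, 2]
labels_b = [2, 2, 0, 0, 1, 1]  # same grouping as labels_a, clusters renamed
labels_c = [0, 1, 0, 1, 0, 1]  # a different grouping

# Identical partitions score 1.0 even though the label values differ
ari_same = adjusted_rand_score(labels_a, labels_b)

# Partitions that disagree score lower (the value can drop below 0)
ari_diff = adjusted_rand_score(labels_a, labels_c)

print(f'same grouping: {ari_same}')
print(f'different grouping: {ari_diff}')
```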

This exercise is part of the course

Explainable AI in Python

Exercise instructions

Derive the original cluster assignments in original_clusters.
In the for loop, remove features one by one and save the result in X_reduced.
Derive the reduced_clusters by applying K-means on X_reduced.
Compute the feature importance based on ARI between the reduced_clusters and the original_clusters.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

kmeans = KMeans(n_clusters=5, random_state=10, n_init=10).fit(X)
# Derive original clusters
original_clusters = ____

for i in range(X.shape[1]):
    # Remove feature at index i
    X_reduced = ____
    # Derive reduced clusters
    reduced_clusters = ____
    # Derive feature importance
    importance = ____
    print(f'{column_names[i]}: {importance}')
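
One possible way to fill in the blanks is sketched below. It is not necessarily the intended solution. The assumptions are: X is a NumPy array (here replaced by synthetic data from make_blobs, with hypothetical column names), the reduced clustering reuses the same K-means settings as the original, and importance is defined as 1 minus the ARI, so that a feature whose removal changes the clusters more receives a higher score.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

# Synthetic stand-ins for the pre-loaded X and column_names (assumption)
X, _ = make_blobs(n_samples=300, centers=5, n_features=4, random_state=10)
column_names = ['feature_0', 'feature_1', 'feature_2', 'feature_3']

kmeans = KMeans(n_clusters=5, random_state=10, n_init=10).fit(X)
# Derive original clusters
original_clusters = kmeans.predict(X)

importances = {}
for i in range(X.shape[1]):
    # Remove feature at index i (drop column i)
    X_reduced = np.delete(X, i, axis=1)
    # Derive reduced clusters with the same K-means settings
    reduced_clusters = KMeans(
        n_clusters=5, random_state=10, n_init=10
    ).fit_predict(X_reduced)
    # Assumed definition: importance = 1 - ARI, so lower agreement
    # with the original clusters means higher importance
    importance = 1 - adjusted_rand_score(original_clusters, reduced_clusters)
    importances[column_names[i]] = importance
    print(f'{column_names[i]}: {importance}')
```

An importance near 0 means the clusters barely change when the feature is dropped; larger values mean the feature has more influence on the assignments.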
