Permutation importance

1. Permutation importance

Welcome back! We've explored some model-specific approaches to explainability; let's dive into some model-agnostic techniques, starting with permutation importance.

2. Shuffling notes to determine instrument's importance

Imagine a band where each musician's instrument is vital to a song. To gauge each instrument's importance, we randomly shuffle a musician's notes and check the impact on the song's quality. If the quality drops significantly, that instrument is crucial to the performance. This concept is similar to permutation importance, where shuffling a feature's data reveals its effect on model performance.

3. Permutation importance

Permutation importance is a powerful model-agnostic method that assesses feature importance by measuring the effect of feature shuffling on model performance. Unlike model-specific methods, permutation importance can be applied to any machine learning model, making it highly versatile. Remember neural networks? Now, we can use permutation importance to explain their predictions.

4. Permutation importance in action

Let's say we have a model trained on a dataset containing five features, x1 through x5.

5. Permutation importance in action

To derive feature importance using permutation importance, we first make predictions on a dataset and calculate the baseline performance metric.

6. Permutation importance in action

Then, we shuffle one feature at a time, keeping other features unchanged, and compute the shuffled performance.

7. Permutation importance in action

The importance of the feature we shuffled is proportional to the drop in performance.

8. Permutation importance in action

A significant drop indicates a high importance for that feature, while a small drop suggests lower importance.
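Putting these steps together, a minimal hand-rolled sketch of the procedure might look like this. The function name, its arguments, and the use of NumPy arrays are assumptions for illustration; scikit-learn's built-in version appears later in the lesson.

```python
import numpy as np

def manual_permutation_importance(model, X, y, n_repeats=10, random_state=42):
    """Mean drop in score after shuffling each feature, one at a time."""
    rng = np.random.default_rng(random_state)
    baseline = model.score(X, y)              # baseline performance metric
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):               # shuffle one feature at a time
        drops = []
        for _ in range(n_repeats):
            X_shuffled = X.copy()
            # Shuffle column j, keeping all other features unchanged
            X_shuffled[:, j] = rng.permutation(X_shuffled[:, j])
            drops.append(baseline - model.score(X_shuffled, y))
        importances[j] = np.mean(drops)       # average drop over repeats
    return importances
```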

9. Admissions dataset

Let's apply this to the graduate admissions dataset. Assume we have our training data in X_train and y_train.
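In code, that setup might look roughly like this; the file name, target column, and split parameters are assumptions for illustration, not details from the original exercise.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical file and target column for the graduate admissions data
admissions = pd.read_csv("admissions.csv")
X = admissions.drop(columns=["Accepted"])
y = admissions["Accepted"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```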

10. MLPClassifier

Assume our model is a neural network, built using the MLPClassifier with two hidden layers of 10 neurons each. We fit the model on the training data.
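A minimal sketch of that model (the random_state value is an assumption for reproducibility):

```python
from sklearn.neural_network import MLPClassifier

# Neural network with two hidden layers of 10 neurons each
model = MLPClassifier(hidden_layer_sizes=(10, 10), random_state=42)
model.fit(X_train, y_train)
```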

11. Permutation importance

Next, we import the permutation_importance function from sklearn.inspection and calculate permutation importance on the trained model. We provide the model, the data, and the number of repeats, which indicates how many times each feature is shuffled; higher values generally give more stable results but take longer to compute. We also pass random_state for reproducibility and the scoring metric, which is accuracy in our case. result.importances_mean gives the average importance score for each feature.
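A minimal version of that call, with n_repeats and random_state chosen as illustrative values:

```python
from sklearn.inspection import permutation_importance

# Shuffle each feature 10 times and measure the average drop in accuracy
result = permutation_importance(
    model, X_train, y_train,
    n_repeats=10, random_state=42, scoring="accuracy"
)
print(result.importances_mean)  # average importance score per feature
```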

12. Visualizing importance

To better interpret these scores and see which feature corresponds to each one, we use a bar plot, passing in the column names and result.importances_mean, making the results clearer and more intuitive. We can see that CGPA and test scores are the most influential features for predicting acceptance.
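A sketch of that plot, assuming X_train is a pandas DataFrame so its columns carry the feature names:

```python
import matplotlib.pyplot as plt

# Bar plot of the average importance score for each feature
plt.barh(X_train.columns, result.importances_mean)
plt.xlabel("Mean drop in accuracy")
plt.title("Permutation importance")
plt.tight_layout()
plt.show()
```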

13. Comparison with model-specific approaches

Comparing these importances with model-specific approaches such as the coefficients of a logistic regression model, we can see that they mostly align, even though they are computed differently and have different values. Both methods assign the highest importance to CGPA and test scores. The main advantage of model-agnostic approaches, though, is their applicability to any type of model, unlike model-specific ones.
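As an illustrative side-by-side (a sketch; the logistic regression settings and the absence of feature scaling are assumptions):

```python
from sklearn.linear_model import LogisticRegression

# Model-specific importances: magnitudes of logistic regression coefficients
logreg = LogisticRegression(max_iter=1000).fit(X_train, y_train)

for name, perm, coef in zip(X_train.columns,
                            result.importances_mean,
                            abs(logreg.coef_[0])):
    print(f"{name}: permutation={perm:.3f}, |coefficient|={coef:.3f}")
```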

14. Let's practice!

Now, let’s practice!