
Local explainability with SHAP

1. Local explainability with SHAP

Let's dive into the concept of local explainability and discover how to apply it using SHAP.

2. Global vs. local explainability

So far, we've been exploring global explainability, which explains a model's overall behavior by pinpointing the features that most significantly impact its predictions. However, this method doesn't explain the rationale behind a model's decision for individual instances. Local explainability fills this gap by explaining the model's prediction for a specific data point. This is particularly vital in sensitive applications like finance or healthcare, where understanding why each individual prediction was made is crucial. Consider the analogy of a professor grading final exams: global explainability is like discussing the general performance trends across all students, while local explainability focuses on the reasons behind a specific student's grade. In AI, both perspectives are important to ensure transparency, fairness, and trust in automated decision-making systems.

3. Heart disease dataset

Let's use SHAP to generate local explanations for a KNN classifier, trained on the heart disease dataset.
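
For context, a minimal sketch of how such a classifier might be set up, assuming the heart disease data is loaded into a pandas DataFrame named heart_df with a binary target column (both names are placeholders, not from the original):

from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Separate features from the binary heart disease label (column name assumed)
X = heart_df.drop(columns=["target"])
y = heart_df["target"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Fit the KNN classifier that SHAP will explain
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)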

4. Local explainability with SHAP

As with global explanations, we create a Kernel Explainer by providing the KNN model's prediction function and a representative summary of the training data. Unlike global methods that explain the entire dataset, we focus on one prediction at a time. Therefore, we define a test_instance by selecting a specific row from the dataset, then get its shap_values. These values have one row per feature and two columns, one per class, since we are dealing with a binary classification problem. To better understand how each feature contributes to the prediction, we need visualizations.
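
A hedged sketch of these steps, reusing the knn model and the X_train/X_test splits assumed above:

import shap

# Summarize the training data so Kernel SHAP stays tractable
background = shap.sample(X_train, 100)

# Kernel Explainer wraps the KNN model's prediction function
explainer = shap.KernelExplainer(knn.predict_proba, background)

# Select one row to explain and compute its SHAP values
test_instance = X_test.iloc[0]
shap_values = explainer.shap_values(test_instance)

# In recent SHAP releases this is an array of shape (n_features, 2):
# one column of SHAP values per class. Older releases return a list
# of two arrays instead.
print(shap_values.shape)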

5. SHAP waterfall plots

SHAP waterfall plots break down the prediction, showing how each feature either increases or decreases the model's prediction.

6. SHAP waterfall plots

The plot displays a baseline value, representing the model's average prediction across all samples, which serves as a starting point for the plot.

7. SHAP waterfall plots

It then shows how each feature either pushes the prediction higher or pulls it lower. For instance, here we see that age and resting_blood_pressure push the heart disease risk higher, while resting_ecg_results pulls it lower.

8. Creating waterfall plots

To create such plots, we use shap.waterfall_plot by providing it with a shap.Explanation instance, constructed with several key components: values, the shap_values of the positive class for the instance; base_values, the baseline of the plot, which is the average prediction for class 1, accessed through explainer.expected_value[1]; data, the actual feature values of the test_instance being explained; and feature_names.
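
Putting that together, a sketch under the same assumptions as above; it presumes a recent SHAP release where shap_values is an (n_features, 2) array (with older releases you would index shap_values[1] instead of shap_values[:, 1]):

# Column 1 holds the SHAP values for the positive class (heart disease = 1)
explanation = shap.Explanation(
    values=shap_values[:, 1],                 # per-feature contributions for class 1
    base_values=explainer.expected_value[1],  # baseline: average prediction for class 1
    data=test_instance.values,                # actual feature values being explained
    feature_names=list(X_test.columns),
)
shap.waterfall_plot(explanation)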

9. Waterfalls for several instances

By examining four waterfall plots, we see how the model’s decision-making varies across instances. Comparing these plots, we can identify which features have the most significant impact on each decision and how these impacts vary from one case to another. This helps in understanding the model's behavior in real-world scenarios, ensuring that the model's decisions are transparent and can be trusted or further scrutinized as needed.
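
One way to produce such a side-by-side comparison, reusing the explainer above and simply taking the first four test rows (the choice of rows is illustrative):

# Generate a waterfall plot for each of four test instances
for i in range(4):
    instance = X_test.iloc[i]
    sv = explainer.shap_values(instance)
    shap.waterfall_plot(
        shap.Explanation(
            values=sv[:, 1],
            base_values=explainer.expected_value[1],
            data=instance.values,
            feature_names=list(X_test.columns),
        )
    )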

10. Let's practice!

Time to practice!