SHAP kernel explainer

1. SHAP kernel explainer

Let's continue our exploration of SHAP, this time focusing on the Kernel Explainer.

2. SHAP kernel explainer

Unlike the Tree Explainer, the Kernel Explainer is a general explainer that can derive SHAP values from any machine learning model, including complex models like K-nearest neighbors, neural networks, and even tree-based models. Both explainers use the same concept to calculate feature importance, but the Kernel Explainer is slower since it doesn't leverage model-specific structures. While it's best to use a model-specific explainer when one is available, the Kernel Explainer remains universally applicable. Let's look at how to apply the Kernel Explainer in code.

3. Heart disease

Just as with the Tree Explainer, we will use Kernel Explainers to analyze two models: one for regression and one for classification. This time, however, we will use neural network models. The first model is an MLPClassifier named mlp_clf, trained to predict the risk of heart disease in patients based on features such as chest pain type and heart rate.

4. Insurance charges

The second model is an MLPRegressor named mlp_reg, trained on the insurance dataset to predict the charges a person pays based on features such as smoking status and age.

5. Creating kernel explainers

We begin by importing SHAP. We then create an explainer using shap.KernelExplainer, which requires two parameters: the model’s prediction function and a representative summary of the dataset, typically a sample that captures the main patterns in the data.
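As a minimal sketch, the call might look as follows, where model_fn and X_summary are hypothetical placeholder names for the prediction function and the data summary:

```python
import shap

# Minimal sketch: model_fn is the model's prediction function and
# X_summary is a representative sample of the training data
# (both hypothetical placeholder names).
explainer = shap.KernelExplainer(model_fn, X_summary)
```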

6. Creating kernel explainers

Just as with any other sklearn model, for the MLPRegressor we use mlp_reg.predict to get the predicted charges, and for the MLPClassifier we use mlp_clf.predict_proba to obtain probability predictions.
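Under these assumptions, the two explainers might be created like this, where insurance_summary and heart_summary are hypothetical names for the data summaries discussed next:

```python
# Regression: pass the raw prediction function.
explainer_reg = shap.KernelExplainer(mlp_reg.predict, insurance_summary)

# Classification: pass predict_proba so SHAP explains class probabilities.
explainer_clf = shap.KernelExplainer(mlp_clf.predict_proba, heart_summary)
```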

7. Creating kernel explainers

Instead of using the entire dataset as the summary, which can slow down computations, it's more efficient to use a subset. The shap.kmeans function clusters the dataset into a specified number of clusters, 10 in our case, and uses the centroids, the center points of each cluster, as representatives. This keeps the summary representative while making computations much quicker. Next, we calculate the shap_values by calling explainer.shap_values, passing in the dataset we want to explain.
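Putting these pieces together for the regression model, a sketch might look like this, assuming X_insurance (a hypothetical name) holds the insurance features:

```python
# Summarize the data with 10 k-means centroids instead of all rows.
insurance_summary = shap.kmeans(X_insurance, 10)

explainer_reg = shap.KernelExplainer(mlp_reg.predict, insurance_summary)

# Compute SHAP values for the samples we want to explain.
shap_values_reg = explainer_reg.shap_values(X_insurance)
```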

8. Feature importance

To compute feature importance, we aggregate the SHAP values by calculating the mean absolute value across all samples. For classification, just as before, we select index 1, which corresponds to the positive class. Plotting the results, we see that for insurance charges, age and smoking status are the most important features, while for heart disease, chest pain type and thalassemia are the most influential, which aligns well with insurance and medical industry insights.
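As a rough sketch, the aggregation and plot could be done as follows, where feature_names is a hypothetical list of column names and shap_values_reg comes from the earlier step:

```python
import numpy as np
import matplotlib.pyplot as plt

# Global importance: mean absolute SHAP value per feature.
importance = np.abs(shap_values_reg).mean(axis=0)

plt.barh(feature_names, importance)
plt.xlabel("Mean |SHAP value|")
plt.title("Insurance charges: feature importance")
plt.show()

# For the classifier, SHAP may return one array of values per class;
# index 1 then selects the positive (heart disease) class, e.g.:
# importance_clf = np.abs(shap_values_clf[1]).mean(axis=0)
```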

9. Comparing with model-specific approaches

Comparing these results with coefficients derived from model-specific linear and logistic regression, we notice that the patterns are similar. While the actual values and computation methods differ, the key features identified remain consistent. However, the major advantage of using model-agnostic approaches is their applicability to any machine learning model, including complex ones such as neural networks, and not being limited to specific model types. This makes them powerful tools in AI, where understanding and interpreting diverse models is crucial for building transparent and reliable systems.

10. Let's practice!

Now, let's practice!