Exercise

Gradient boosting feature importances

As with random forests, we can extract feature importances from gradient boosting models to understand which features are the best predictors. Sometimes it's nice to try different tree-based models and look at the feature importances from all of them. This can help average out any peculiarities that may arise from one particular model.

The feature importances are stored as a numpy array in the .feature_importances_ property of the gradient boosting model. We'll need to get the sorted indices of the feature importances, using np.argsort(), in order to make a nice plot. We want the features ordered from largest to smallest importance, so we'll use Python's slice indexing to reverse the sorted indices, like sorted_index[::-1].
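Below is a minimal sketch of this step. It uses synthetic data from make_regression as a stand-in; in the exercise you would fit the model on the prepared train_features and train_targets instead.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic stand-in data; the exercise uses its own train_features/train_targets
train_features, train_targets = make_regression(
    n_samples=200, n_features=4, random_state=42)

gbr = GradientBoostingRegressor(random_state=42)
gbr.fit(train_features, train_targets)

# feature_importances_ is a 1-D numpy array with one value per feature
feat_importances = gbr.feature_importances_

# argsort() gives indices from smallest to largest importance;
# [::-1] reverses them so the most important feature comes first
sorted_index = np.argsort(feat_importances)[::-1]
print(feat_importances[sorted_index])
```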

Instructions

  • Reverse the sorted_index variable so it goes from greatest to least importance using Python's slice indexing.
  • Create the sorted feature labels as labels by converting feature_names to a numpy array and indexing it with sorted_index.
  • Create a bar plot with the x tick positions as the bar locations, feature_importances indexed with sorted_index as the bar heights, and labels as the x tick labels (see the sketch after this list).
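
The following sketch puts the steps together. The feature names and importance values here are hypothetical placeholders; in the exercise they would come from your dataset and fitted gradient boosting model.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical placeholders for the exercise's feature names and fitted model output
feature_names = ['feat_a', 'feat_b', 'feat_c', 'feat_d']
feature_importances = np.array([0.15, 0.40, 0.35, 0.10])

# Indices of importances sorted from greatest to least
sorted_index = np.argsort(feature_importances)[::-1]
x = range(len(feature_importances))

# Convert feature_names to a numpy array so it can be indexed with sorted_index
labels = np.array(feature_names)[sorted_index]

# Bar plot of importances in descending order, labeled with the feature names
plt.bar(x, feature_importances[sorted_index], tick_label=labels)
plt.xticks(rotation=90)  # rotate labels in case they are long
plt.show()
```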