
Random forest feature importances

One useful aspect of tree-based methods is the ability to extract feature importances. This is a quantitative way to measure how much each feature contributes to our predictions. It can help us focus on our best features, possibly enhancing or tuning them, and can also help us get rid of useless features that may be cluttering up our model.

Tree models in sklearn have a .feature_importances_ property that's accessible after fitting the model. This stores the feature importance scores. To make a nice-looking bar plot of feature importances sorted from greatest to least, we need the indices that would sort the importances. np.argsort() returns indices from least to greatest, so we reverse the result to get greatest-to-least order.
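As a quick illustration of the argsort-and-reverse pattern, here is a minimal sketch using made-up importance scores (the values are hypothetical, chosen only to show the ordering):

import numpy as np

# Hypothetical importance scores for three features
importances = np.array([0.1, 0.7, 0.2])

# np.argsort() sorts from least to greatest: [0, 2, 1]
# Reversing with [::-1] gives greatest to least: [1, 2, 0]
sorted_index = np.argsort(importances)[::-1]
print(sorted_index)  # [1 2 0]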

This exercise is part of the course Machine Learning for Finance in Python.

Exercise instructions

  • Use the feature_importances_ property of our random forest model (rfr) to extract feature importances into the importances variable.
  • Use NumPy's argsort() to get the indices of the feature importances, then reverse them so they run from greatest to least, and save the result in the sorted_index variable.
  • Set the xtick labels to the feature names in the labels variable, using the sorted_index array. feature_names must be converted to a NumPy array so we can index it with sorted_index.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Get feature importances from our random forest model
importances = rfr.____

# Get the indices of the importances, sorted from greatest to least
sorted_index = ____(importances)[::-1]
x = range(len(importances))

# Create tick labels 
labels = np.array(____)[____]
plt.bar(x, importances[sorted_index], tick_label=labels)

# Rotate tick labels to vertical
plt.xticks(rotation=90)
plt.show()
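
For reference, here is one way the completed code might look. It assumes rfr is an already-fitted RandomForestRegressor and feature_names holds the names of the training columns, both of which the exercise environment provides; the imports are only needed if you run this outside that environment.

import numpy as np
import matplotlib.pyplot as plt

# Get feature importances from our random forest model
importances = rfr.feature_importances_

# Get the indices of the importances, sorted from greatest to least
sorted_index = np.argsort(importances)[::-1]
x = range(len(importances))

# Create tick labels from the feature names, in sorted order
labels = np.array(feature_names)[sorted_index]
plt.bar(x, importances[sorted_index], tick_label=labels)

# Rotate tick labels to vertical so long feature names stay readable
plt.xticks(rotation=90)
plt.show()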