
Sorting important features

Among other things, Decision Trees are very popular because of their interpretability. Many models can provide accurate predictions, but a Decision Tree can also quantify the effect of the different features on the target. Here, it can tell you which features have the strongest and weakest impact on the decision to leave the company. In sklearn, you can get this information by using the feature_importances_ attribute.
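
For example, on any fitted DecisionTreeClassifier the attribute is simply an array with one importance score per feature, and the scores sum to 1. Below is a minimal sketch on toy data; the column names and values are illustrative only and not part of the exercise.

# A minimal, self-contained sketch: fit a small tree and inspect
# feature_importances_. The toy data and column names are illustrative only.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

toy = pd.DataFrame({
    "satisfaction": [0.4, 0.8, 0.3, 0.9, 0.2, 0.7],
    "evaluation":   [0.5, 0.9, 0.4, 0.8, 0.6, 0.7],
    "left_company": [1, 0, 1, 0, 1, 0],
})

toy_features = toy[["satisfaction", "evaluation"]]
toy_target = toy["left_company"]

model = DecisionTreeClassifier(random_state=42).fit(toy_features, toy_target)

# One importance score per feature; the scores sum to 1
print(model.feature_importances_)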

In this exercise, you're going to get the quantified importance of each feature, save them in a pandas DataFrame (a Pythonic table), and sort them from the most important to the least important. The model_best Decision Tree Classifier used in the previous exercises is available in your workspace, as well as the features_test and features_train variables.

pandas has been imported as pd.

This exercise is part of the course “HR Analytics: Predicting Employee Churn in Python”.


Exercise instructions

  • Use the feature_importances_ attribute to calculate relative feature importances
  • Create a list of features
  • Save the results inside a DataFrame using the DataFrame() function, where the features are rows and their respective values are a column
  • Sort the relative_importances DataFrame to get the most important features on top using the sort_values() function and print the result

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Calculate feature importances
feature_importances = model_best.____

# Create a list of features: done
feature_list = list(features)

# Save the results inside a DataFrame using feature_list as an index
relative_importances = pd.____(index=____, data=feature_importances, columns=["importance"])

# Sort values to learn most important features
relative_importances.____(by="importance", ascending=False)
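
If you want to compare your answer with a completed version, here is one possible way to fill in the blanks, assuming model_best and features are available in the workspace as described above.

# One possible completed version (assumes model_best and features
# exist in the workspace, as described above)

# Calculate feature importances
feature_importances = model_best.feature_importances_

# Create a list of features: done
feature_list = list(features)

# Save the results inside a DataFrame using feature_list as an index
relative_importances = pd.DataFrame(index=feature_list,
                                    data=feature_importances,
                                    columns=["importance"])

# Sort values to learn most important features and print the result
print(relative_importances.sort_values(by="importance", ascending=False))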