Selecting important features

In this exercise, you will select only the most important features for use in the final model. Recall that the relative importances are stored in the importance column of the DataFrame relative_importances.

This exercise is part of the course

HR Analytics: Predicting Employee Churn in Python

Exercise instructions

  • Select only the features with an importance value higher than 1%.
  • Create a list from those features and print them (this has been done for you).
  • Using the index saved in selected_list, subset both features_train and features_test so that they contain only the features with an importance higher than 1%.

Hands-on interactive exercise

Try this exercise by completing this sample code.

# select only features with relative importance higher than 1%
selected_features = relative_importances[relative_importances.____>0.01]

# create a list from those features: done
selected_list = selected_features.index

# transform both features_train and features_test components to include only selected features
features_train_selected = features_train[selected_list]
features_test_selected = ____[____]
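
The steps in the template can be sketched end to end on toy data. This is only an illustration, not the course's solution or dataset: the feature names, importance values, and the contents of features_train and features_test below are hypothetical, and relative_importances is assumed to be a pandas DataFrame indexed by feature name with a single importance column, as the exercise text describes.

```python
import pandas as pd

# Hypothetical importances, invented for illustration only.
# Assumption: the index holds feature names and "importance" holds shares.
relative_importances = pd.DataFrame(
    {"importance": [0.45, 0.30, 0.20, 0.045, 0.005]},
    index=["feature_a", "feature_b", "feature_c", "feature_d", "feature_e"],
)

# Hypothetical train/test feature matrices with one column per feature.
columns = list(relative_importances.index)
features_train = pd.DataFrame(
    [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]], columns=columns
)
features_test = pd.DataFrame(
    [[16, 17, 18, 19, 20], [21, 22, 23, 24, 25]], columns=columns
)

# Keep only the rows (features) whose importance exceeds 1%.
selected_features = relative_importances[relative_importances["importance"] > 0.01]

# The index of the filtered DataFrame holds the selected feature names.
selected_list = selected_features.index
print(list(selected_list))

# Apply the same column subset to both sets so they stay aligned.
features_train_selected = features_train[selected_list]
features_test_selected = features_test[selected_list]
```

Subsetting the training and test sets with the same selected_list matters because a model fitted on the reduced training columns expects exactly those columns, in the same order, at prediction time.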