Exercise

Feature selection

While preparing your data for modeling, it is important to ensure that the model has a set of helpful features to base its predictions (or diagnoses) on. To be helpful, features need to capture essential characteristics of the heart disease dataset in an orthogonal, non-redundant way; more data isn't always better!

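You can use the sklearn.feature_selection.SelectFromModel class to select useful features. SelectFromModel is a meta-transformer that wraps an estimator, here a RandomForestClassifier, fits it, and keeps only the features whose importance meets a threshold (by default, the mean importance). In this exercise, it is used to find the most salient features for the task of heart disease diagnosis.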

RandomForestClassifier has been imported, and the heart disease features and target have been loaded as X_train and y_train, respectively.

Instructions 1/4

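  • Define a random forest classifier with n_jobs=-1, class_weight='balanced', and max_depth=5, then perform feature selection on the heart disease data (heart_disease_df, available as X_train and y_train) by calling .fit() on a SelectFromModel object that wraps the classifier.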
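
The step above could be sketched roughly as follows. This is an illustration under stated assumptions, not the exercise's reference solution: the real heart disease data is not available here, so make_classification generates a synthetic stand-in for X_train and y_train, and random_state=0 is added only to make the run repeatable. The names rf, selector, mask, and X_selected are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

# ASSUMPTION: synthetic stand-in for the heart disease training data,
# which the exercise environment would normally provide.
X_train, y_train = make_classification(
    n_samples=300, n_features=10, n_informative=3, n_redundant=2, random_state=0
)

# Random forest with the settings named in the instruction.
# random_state is an addition for reproducibility, not part of the exercise.
rf = RandomForestClassifier(
    n_jobs=-1, class_weight="balanced", max_depth=5, random_state=0
)

# SelectFromModel fits a clone of rf and keeps the features whose
# importance is at least the threshold (mean importance by default here).
selector = SelectFromModel(rf)
selector.fit(X_train, y_train)

importances = selector.estimator_.feature_importances_
mask = selector.get_support()              # boolean mask of kept features
X_selected = selector.transform(X_train)   # data reduced to kept columns

print("Importances:", np.round(importances, 3))
print("Threshold:", round(float(selector.threshold_), 3))
print("Kept feature indices:", np.flatnonzero(mask))
```

With a pandas DataFrame as input, the same mask could be applied to the column names (for example, X_train.columns[mask]) to see which features were retained.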