Feature selection
While preparing your data for modeling, it is important to ensure that you have a set of helpful features for the model to base its predictions (or diagnoses) on. To be helpful, features need to capture essential characteristics of the heart disease dataset in an orthogonal way, meaning each feature should contribute information that the others do not; more data isn't always better!
You can use the sklearn.feature_selection.SelectFromModel class to select useful features. SelectFromModel is a meta-transformer that keeps the features whose importance, as reported by a fitted estimator, meets a threshold. Here it uses a RandomForestClassifier model to find the most salient features for the task of heart disease diagnosis.
RandomForestClassifier has been imported, and the heart disease data features and target have been loaded as X_train and y_train, respectively.
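To make the selection step concrete, here is a minimal sketch of SelectFromModel wrapping a random forest. It is not the exercise solution: the data is synthetic (generated with make_classification as a stand-in for the heart disease features), and the names X_demo, y_demo and selector, as well as the hyperparameter values, are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

# Synthetic stand-in for the heart disease data (assumption: the real
# X_train / y_train are not reproduced here).
X_demo, y_demo = make_classification(
    n_samples=300,
    n_features=10,
    n_informative=3,
    n_redundant=2,
    random_state=0,
)

# Illustrative hyperparameters, not the values the exercise expects.
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=50, max_depth=5, random_state=0),
    threshold="mean",  # keep features whose importance is >= the mean importance
)

# fit() trains the wrapped forest and records which features pass the threshold.
selector.fit(X_demo, y_demo)

mask = selector.get_support()            # boolean mask, one entry per feature
X_selected = selector.transform(X_demo)  # only the retained columns

print("kept feature indices:", np.flatnonzero(mask))
print("shape before/after:", X_demo.shape, X_selected.shape)
```

The "mean" threshold is one possible choice; a stricter or looser cutoff would keep fewer or more features, so it is worth checking how the downstream model performs on the reduced set.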
This exercise is part of the course End-to-End Machine Learning.
Interactive exercise
Try this exercise by completing the sample code.
from sklearn.feature_selection import SelectFromModel
# Define the random forest model and fit to the training data
rf = ____(____=____, ____=____, ____=____)
rf.____(____, ____)
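Once a forest has been fitted, its feature_importances_ attribute holds the scores that importance-based selection relies on. The sketch below, again on synthetic data and with assumed names (X_toy, y_toy, forest) and assumed hyperparameters, shows one way to inspect and rank those scores. It leaves the blanks in the exercise code above for you to fill in.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Small synthetic dataset (assumption: stands in for X_train / y_train).
X_toy, y_toy = make_classification(
    n_samples=200,
    n_features=6,
    n_informative=2,
    n_redundant=0,
    random_state=1,
)

# Hypothetical settings chosen only to keep the example fast.
forest = RandomForestClassifier(n_estimators=25, random_state=1)
forest.fit(X_toy, y_toy)

# Impurity-based importances: one non-negative score per feature, normalized to sum to 1.
importances = forest.feature_importances_

# Rank feature indices from most to least important.
ranking = np.argsort(importances)[::-1]
for idx in ranking:
    print(f"feature {idx}: importance {importances[idx]:.3f}")
```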