Your first pipeline
Your colleague has used AdaBoostClassifier
for the credit scoring dataset. You want to also try out a random forest classifier. In this exercise, you will fit this classifier to the data and compare it to AdaBoostClassifier
. Make sure to use train/test data splitting to avoid overfitting. The data is preloaded and transformed so that all features are numeric. The features are available as X
and the labels as y
. The module RandomForestClassifier
has also been preloaded.
Diese Übung ist Teil des Kurses
Designing Machine Learning Workflows in Python
Interaktive Übung
Vervollständige den Beispielcode, um diese Übung erfolgreich abzuschließen.
# Split the data into train and test, with 20% as test
X_train, ____, ____, y_test = train_test_split(
X, y, ____=0.2, random_state=1)