Your first pipeline
Your colleague has used AdaBoostClassifier for the credit scoring dataset. You also want to try a random forest classifier. In this exercise, you will fit this classifier to the data and compare it to AdaBoostClassifier. Make sure to use a train/test split so that both models are evaluated on data they were not trained on; otherwise an overfitted model can look better than it is. The data is preloaded and transformed so that all features are numeric. The features are available as X and the labels as y. The RandomForestClassifier class has also been preloaded.
This exercise is part of the course Designing Machine Learning Workflows in Python.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Split the data into train and test, with 20% as test
X_train, ____, ____, y_test = train_test_split(
    X, y, ____=0.2, random_state=1)
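
For orientation, here is a minimal sketch of the whole workflow the exercise describes: split the data, fit both classifiers, and compare them on the held-out set. It is not the course's reference solution. The real credit scoring data is not available here, so the sketch assumes a synthetic stand-in built with make_classification; the sample size, the feature count, the use of accuracy as the comparison metric, and the scores dictionary are all illustrative choices rather than anything the exercise specifies.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: synthetic numeric data standing in for the preloaded X and y.
X, y = make_classification(n_samples=500, n_features=10, random_state=1)

# Hold out 20% of the rows as a test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1)

# Fit each classifier on the training set and score it on the test set.
models = {
    "random_forest": RandomForestClassifier(random_state=1),
    "adaboost": AdaBoostClassifier(random_state=1),
}
scores = {}
for name, clf in models.items():
    clf.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, clf.predict(X_test))

print(scores)
```

Because both models are scored on the same held-out rows, the two numbers are directly comparable. On a single split the difference can still be noisy, so a small gap should not be read as a clear win for either model.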