Your first pipeline
Your colleague has used AdaBoostClassifier on the credit scoring dataset. You also want to try a random forest classifier. In this exercise, you will fit this classifier to the data and compare it to AdaBoostClassifier. Make sure to split the data into train and test sets, so that performance is measured on data the model has not seen and overfitting can be detected. The data is preloaded and transformed so that all features are numeric. The features are available as X and the labels as y. The RandomForestClassifier class has also been preloaded.
This exercise is part of the course
Designing Machine Learning Workflows in Python
Hands-on interactive exercise
Try to solve this exercise by completing the sample code.
# Split the data into train and test, with 20% as test
X_train, ____, ____, y_test = train_test_split(
X, y, ____=0.2, random_state=1)
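
For reference, the overall workflow described above (split, fit, compare on held-out data) might look roughly like the sketch below. It is not the exercise's official solution. It assumes scikit-learn is installed, and because the credit scoring data is not available here, it substitutes a synthetic dataset from make_classification. The sample size, feature count, the random_state values on the classifiers, and the use of accuracy as the comparison metric are all illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: synthetic stand-in for the preloaded credit scoring data.
X, y = make_classification(n_samples=500, n_features=10, random_state=1)

# Hold out 20% of the rows as a test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1)

# Fit both classifiers on the training split only, then score each
# on the held-out test split. Accuracy is an assumed metric choice.
models = {
    "random_forest": RandomForestClassifier(random_state=1),
    "adaboost": AdaBoostClassifier(random_state=1),
}
scores = {}
for name, clf in models.items():
    clf.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, clf.predict(X_test))

print(scores)
```

Scoring both models on the same held-out split keeps the comparison fair. A large gap between training and test accuracy for either model would be a sign of overfitting.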