
Your first pipeline

Your colleague has used AdaBoostClassifier on the credit scoring dataset. You also want to try a random forest classifier. In this exercise, you will fit this classifier to the data and compare it to AdaBoostClassifier. Use a train/test split so that performance is measured on data the model has not seen during training, which exposes overfitting. The data is preloaded and transformed so that all features are numeric. The features are available as X and the labels as y. The RandomForestClassifier class has also been preloaded.

This exercise is part of the course

Designing Machine Learning Workflows in Python


Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Split the data into train and test, with 20% as test
X_train, ____, ____, y_test = train_test_split(
  X, y, ____=0.2, random_state=1)
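The following is a minimal sketch of one way the split and the comparison described above could be carried out, not the course's reference solution. It assumes scikit-learn is installed, and it uses make_classification to generate a synthetic stand-in for the preloaded credit scoring data, so the scores it prints say nothing about the real dataset. The scores dictionary and the choice of accuracy as the metric are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: synthetic numeric data standing in for the preloaded X and y.
X, y = make_classification(n_samples=500, n_features=10, random_state=1)

# Split the data into train and test, with 20% as test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1)

# Fit both classifiers on the training split and score them on the test split.
scores = {}
for name, clf in [
    ("random_forest", RandomForestClassifier(random_state=1)),
    ("adaboost", AdaBoostClassifier(random_state=1)),
]:
    clf.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, clf.predict(X_test))

print(scores)
```

Because both models are scored on the same held-out 20%, the two accuracy values are directly comparable. A large gap between a model's training and test accuracy would point to overfitting.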