Designing Machine Learning Workflows in Python

Exercise

Your first pipeline

Your colleague has used AdaBoostClassifier on the credit scoring dataset. You also want to try a random forest classifier. In this exercise, you will fit this classifier to the data and compare it to AdaBoostClassifier. Make sure to split the data into train and test sets so that performance is measured on examples the model has not seen, which lets you detect overfitting. The data is preloaded and transformed so that all features are numeric. The features are available as X and the labels as y. The class RandomForestClassifier has also been preloaded.

Instructions 1/3

  • Split the data into train (X_train and y_train) and test (X_test and y_test). Use 20% of the examples for the test set.
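
The split described above could be sketched as follows. This is a minimal illustration, not the course's reference solution: the exercise's preloaded X and y are not available here, so a synthetic dataset from make_classification stands in for the credit scoring data, and the random_state values and the choice of accuracy as the comparison metric are assumptions made for reproducibility. The fit-and-compare loop goes beyond this first instruction and only previews the comparison the exercise describes.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: synthetic stand-in for the preloaded, all-numeric credit data.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Hold out 20% of the examples as the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit both classifiers on the training data and score them on the test data.
models = {
    "RandomForestClassifier": RandomForestClassifier(random_state=0),
    "AdaBoostClassifier": AdaBoostClassifier(random_state=0),
}
scores = {}
for name, clf in models.items():
    clf.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, clf.predict(X_test))
    print(f"{name}: test accuracy = {scores[name]:.3f}")
```

In the exercise itself, X and y already exist, so only the train_test_split call with test_size=0.2 is needed for this step.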