Exercise

Limiting the sample size

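Another way to limit overfitting is to specify the minimum number of observations that each leaf (terminal node) of the decision tree must contain. A split is only allowed if it leaves at least that many observations in each resulting leaf, so the tree cannot keep growing to fit a handful of samples.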

In this exercise, you will:

  • set this minimum limit to 100
  • fit the new model to the employee data
  • examine prediction results on both training and test sets

The variables features_train, target_train, features_test and target_test are already available in your workspace.

Instructions
  • Initialize the DecisionTreeClassifier and set the minimum number of observations per leaf to 100.
  • Fit the decision tree model to the training data.
  • Check the accuracy of the predictions on both the training and test sets.
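
The three steps above could be sketched roughly as follows. This is a minimal illustration, not the exercise's reference solution: it assumes scikit-learn's DecisionTreeClassifier with its min_samples_leaf parameter, and because the employee data is not shown here, it builds a synthetic stand-in with make_classification. The names model_sample_100, train_accuracy and test_accuracy, and the random_state value, are hypothetical choices made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in for the employee data, which is not shown.
# In the exercise workspace these four variables already exist.
features, target = make_classification(
    n_samples=2000, n_features=10, random_state=42
)
features_train, features_test, target_train, target_test = train_test_split(
    features, target, test_size=0.25, random_state=42
)

# Require at least 100 training observations in every leaf.
model_sample_100 = DecisionTreeClassifier(min_samples_leaf=100, random_state=42)

# Fit the tree to the training data.
model_sample_100.fit(features_train, target_train)

# score() returns mean accuracy as a fraction; convert to a percentage.
train_accuracy = model_sample_100.score(features_train, target_train) * 100
test_accuracy = model_sample_100.score(features_test, target_test) * 100

print(f"Training accuracy: {train_accuracy:.2f}%")
print(f"Test accuracy: {test_accuracy:.2f}%")
```

When the training and test accuracies are close to each other, that is usually a sign that the leaf-size limit is keeping the tree from memorizing the training set. The actual numbers depend on the data, so none are quoted here.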