Get startedGet started for free

Evaluate the optimal tree

In this exercise, you'll evaluate the test set ROC AUC score of grid_dt's optimal model.

In order to do so, you will first determine the probability of obtaining the positive label for each test set observation. You can use the methodpredict_proba() of an sklearn classifier to compute a 2D array containing the probabilities of the negative and positive class-labels respectively along columns.

The dataset is already loaded and processed for you (numerical features are standardized); it is split into 80% train and 20% test. X_test, y_test are available in your workspace. In addition, we have also loaded the trained GridSearchCV object grid_dt that you instantiated in the previous exercise. Note that grid_dt was trained as follows:

grid_dt.fit(X_train, y_train)

This exercise is part of the course

Machine Learning with Tree-Based Models in Python

View Course

Exercise instructions

  • Import roc_auc_score from sklearn.metrics.

  • Extract the .best_estimator_ attribute from grid_dt and assign it to best_model.

  • Predict the test set probabilities of obtaining the positive class y_pred_proba.

  • Compute the test set ROC AUC score test_roc_auc of best_model.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Import roc_auc_score from sklearn.metrics
____

# Extract the best estimator
best_model = ____

# Predict the test set probabilities of the positive class
y_pred_proba = ____

# Compute test_roc_auc
test_roc_auc = ____

# Print test_roc_auc
print('Test set ROC AUC score: {:.3f}'.format(test_roc_auc))
Edit and Run Code