Beyond only accuracy
In this exercise, you will go beyond accuracy and evaluate the area under the ROC curve (AUC) for a basic decision tree model. Remember that a random classifier has an AUC of 0.5, so you will want to achieve a score higher than 0.5.
X is available as a DataFrame with features, and y is available as a DataFrame with target values. Both sklearn and pandas as pd are also available in your workspace.
We will use this setup to look at the AUC of our ROC curve.
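To make the 0.5 baseline concrete, here is a minimal sketch that scores a classifier with no signal. The labels and scores are synthetic and purely illustrative (the 15% positive rate is an assumption, not a property of the course data), and roc_auc_score() is used here as a shortcut for computing the AUC directly from scores.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical binary labels; the 15% positive rate is an illustrative assumption
y_true = (rng.random(20000) < 0.15).astype(int)

# Scores drawn independently of the labels: a classifier with no signal
random_scores = rng.random(20000)

# The AUC of such scores should land close to 0.5
baseline_auc = roc_auc_score(y_true, random_scores)
print(round(baseline_auc, 3))
```

Any model worth keeping should score clearly above this value on held-out data.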
This exercise is part of the course
Predicting CTR with Machine Learning in Python
Exercise instructions
- Split the data into training and testing sets.
- Fit the classifier on the training data and make predictions for the testing data using predict_proba() and predict().
- Evaluate the area under the ROC curve by calling the roc_curve() function on y_test, via roc_curve(y_test, y_score[:, 1]), and passing the result to auc().
Interactive exercise
Try this exercise by completing this sample code.
# Training and testing
X_train, X_test, y_train, y_test = \
    ____(X, y, test_size=0.2, random_state=0)
# Create decision tree classifier
clf = DecisionTreeClassifier()
# Train classifier - predict probability score and label
y_score = clf.fit(____, ____).predict_proba(____)
y_pred = clf.fit(____, ____).predict(____)
# Get ROC curve metrics
fpr, tpr, thresholds = ____(____, y_score[:, 1])
roc_auc = auc(fpr, tpr)
print(roc_auc)
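For reference, the same workflow can be sketched end to end on synthetic data. This is an illustration under stated assumptions, not the course's dataset or its official solution: make_classification() stands in for the click data, y is a Series rather than a DataFrame, and the classifier is fitted once and reused for both predict_proba() and predict() instead of being refitted for each call.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the exercise's X and y (assumption: not the course data)
X_arr, y_arr = make_classification(
    n_samples=2000, n_features=10, n_informative=5, random_state=0
)
X = pd.DataFrame(X_arr)
y = pd.Series(y_arr)

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit once, then reuse the fitted tree for scores and labels
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)
y_score = clf.predict_proba(X_test)  # one column of probabilities per class
y_pred = clf.predict(X_test)         # hard class labels

# Column 1 holds the probability of the positive class
fpr, tpr, thresholds = roc_curve(y_test, y_score[:, 1])
roc_auc = auc(fpr, tpr)
print(roc_auc)
```

An unpruned decision tree tends to output probabilities of exactly 0 or 1, so its ROC curve has very few distinct thresholds; limiting the depth (for example with max_depth) usually gives a smoother curve.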