Default thresholding

You would like to confirm that the DecisionTreeClassifier() uses the same default classification threshold as mentioned in the previous lesson, namely 0.5. It seems strange to you that all classifiers should use the same threshold. Let's check! A fitted decision tree classifier clf has been preloaded for you, as have the training and test data with their usual names: X_train, X_test, y_train and y_test. You will have to extract probability scores from the classifier using the .predict_proba() method.

This exercise is part of the course

Designing Machine Learning Workflows in Python

Instructions

  • Produce scores for the test examples, using the preloaded classifier clf.
  • Now extract labels from the scores. Remember that you have a pair of scores for each example, not a single score, and the second element is the probability of the positive class.
  • Now label the test data using the standard .predict() method.
  • Finally, compare with the predictions you got before. Are they identical?
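
The steps above can be tried outside the course environment with the following minimal sketch. It does not use the preloaded clf or the course data: the synthetic dataset from make_classification, the max_depth=3 setting, and the names positive_scores, preds_threshold and agree are all illustrative assumptions. The only claim it relies on is that, for a binary problem, .predict() picks the class with the highest score, which should coincide with thresholding the positive-class score at 0.5.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic binary data stands in for the preloaded course data.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Assumption: a shallow tree, so that leaves are impure and the scores
# are not all exactly 0 or 1.
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

# One row per test example, one column per class (ordered as in clf.classes_).
scores = clf.predict_proba(X_test)

# The second column holds the probability of the positive class.
positive_scores = scores[:, 1]

# Label an example positive when its score exceeds 0.5.
preds_threshold = (positive_scores > 0.5).astype(int)

# Labels produced by the classifier's own .predict() method.
preds_default = clf.predict(X_test)

# True when both labelings match on every test example.
agree = np.array_equal(preds_threshold, preds_default)
print(agree)
```

Working with NumPy arrays on both sides avoids the ambiguity of comparing a Python list with an array, where == is applied element by element rather than returning a single True or False.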

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Score the test data using the given classifier
scores = clf.____(____)

# Get labels from the scores using the default threshold
preds = [s[____] > ____ for s in scores]

# Use the predict method to label the test data again
preds_default = clf.____(____)

# Compare the two sets of predictions
____(preds == preds_default)