Exercise

Default thresholding

You would like to confirm that DecisionTreeClassifier() uses the same default classification threshold mentioned in the previous lesson, namely 0.5. It seems strange to you that all classifiers should use the same threshold. Let's check! A fitted decision tree classifier, clf, has been preloaded for you, as have the training and test data under their usual names: X_train, X_test, y_train, and y_test. You will need to extract probability scores from the classifier using the .predict_proba() method.

Instructions

  • Produce scores for the test examples, using the preloaded classifier clf.
  • Now extract labels from the scores by thresholding them at 0.5. Remember that each example has a pair of scores, not a single score, and that the second element is the probability of the positive class.
  • Now label the test data using the standard .predict() method.
  • Finally, compare these labels with the ones you extracted from the scores. Are they identical?
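
The four steps above can be sketched roughly as follows. In the exercise itself, clf and the data are preloaded; here they are replaced by a synthetic dataset and a small tree so the snippet runs on its own. The dataset, the max_depth setting, and the names preds_from_scores and identical are assumptions made for illustration, not part of the course material.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic binary data standing in for the preloaded X_train, X_test, ...
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Assumption: a shallow tree standing in for the preloaded fitted classifier clf.
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

# Step 1: scores for the test examples (one row per example, one column per class).
scores = clf.predict_proba(X_test)

# Step 2: the second column is the positive-class probability; threshold it at 0.5.
preds_from_scores = (scores[:, 1] > 0.5).astype(int)

# Step 3: labels from the standard .predict() method.
preds = clf.predict(X_test)

# Step 4: check whether the two sets of labels agree on every test example.
identical = np.array_equal(preds_from_scores, preds)
print(identical)
```

For a binary problem with classes 0 and 1, .predict() returns the class with the larger probability, so thresholding the positive-class score with a strict > 0.5 should reproduce its labels, including on an exact 0.5 tie, where both approaches fall back to class 0.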