Designing Machine Learning Workflows in Python

Exercise

Contamination revisited

You notice that the one-class SVM does not have a contamination parameter. By now you know that you need a way to control the proportion of examples labeled as novelties in order to control your false positive rate, so you decide to experiment with thresholding the scores. The detector has been imported as onesvm. You also have the data available as X_train, X_test, y_train, and y_test, along with numpy as np and confusion_matrix().

Instructions

  • Fit the one-class SVM and score the test data.
  • Compute the observed proportion of outliers in the test data.
  • Use np.quantile() to find where to threshold the scores to achieve that proportion.
  • Use that threshold to label the test data. Print the confusion matrix.
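
The four steps above could be sketched roughly as follows. The exercise's data is not shown here, so this sketch builds a small synthetic dataset instead. It also assumes that onesvm refers to scikit-learn's OneClassSVM, that a label of 1 marks an outlier, and that lower scores mean "more abnormal" (which is how score_samples() behaves in scikit-learn). Treat it as one possible reading of the instructions, not the course's reference solution.

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.svm import OneClassSVM as onesvm  # assumed meaning of `onesvm`

# Synthetic stand-in for the exercise data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(200, 2))      # normal examples only
X_inliers = rng.normal(0.0, 1.0, size=(90, 2))
X_outliers = rng.uniform(4.0, 6.0, size=(10, 2))   # far from the training cloud
X_test = np.vstack([X_inliers, X_outliers])
y_test = np.array([0] * 90 + [1] * 10)             # assumption: 1 marks an outlier

# Step 1: fit the one-class SVM on the training data and score the test data.
detector = onesvm().fit(X_train)
scores = detector.score_samples(X_test)            # higher score = more normal

# Step 2: observed proportion of outliers in the test data.
prop = np.mean(y_test == 1)

# Step 3: the score below which roughly that proportion of test examples falls.
threshold = np.quantile(scores, prop)

# Step 4: label the lowest-scoring examples as outliers and compare to the truth.
y_pred = (scores <= threshold).astype(int)
cm = confusion_matrix(y_test, y_pred)
print(cm)
```

Because the threshold is a quantile of the test scores themselves, the share of flagged examples matches the chosen proportion by construction. What the confusion matrix then shows is whether the flagged examples are the right ones.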