Designing Machine Learning Workflows in Python


Exercise

LOF contamination

Your medical advisor at the arrhythmia startup informs you that your training data might not contain all possible types of arrhythmia. How on earth will you detect these other types without any labeled examples? Could an anomaly detector tell the difference between healthy and unhealthy without access to labels? Before answering that, you experiment with the contamination parameter to see its effect on the confusion matrix. You have LocalOutlierFactor as lof, numpy as np, the labels as ground_truth, encoded as -1 and 1 just like the local outlier factor output (-1 for outliers, 1 for inliers), and the unlabeled training data as X.
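To make the setup concrete, here is a minimal sketch of the objects the exercise describes. The arrhythmia data is not reproduced on this page, so X and ground_truth below are hypothetical synthetic stand-ins (180 inliers and 20 scattered outliers). Only the aliases lof and np and the -1/1 encoding come from the exercise text.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor as lof

# Hypothetical stand-in for the exercise data (the real X is not shown here):
# 180 "healthy" points near the origin and 20 widely scattered "unhealthy" points.
rng = np.random.default_rng(0)
inliers = rng.normal(loc=0.0, scale=1.0, size=(180, 2))
outliers = rng.uniform(low=-8.0, high=8.0, size=(20, 2))
X = np.vstack([inliers, outliers])

# Labels use the same encoding as the detector's output: 1 = inlier, -1 = outlier.
ground_truth = np.concatenate([np.ones(180, dtype=int), -np.ones(20, dtype=int)])

# fit_predict fits the detector on X and returns one label per row, each -1 or 1.
preds = lof().fit_predict(X)

# Because outliers are coded as -1, their share of the data is a simple mean.
outlier_share = np.mean(ground_truth == -1)
print(outlier_share)
```

Because the labels and the predictions share one encoding, they can be compared directly, for example in a confusion matrix.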

Instructions 1/3

  1. Fit a local outlier factor, output its predictions on X, and print the confusion matrix for these predictions.

  2. Repeat, but now set the proportion of data points flagged as outliers to 0.2. Print the confusion matrix.

  3. Now set the contamination equal to the actual proportion of outliers in the data.
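The three steps can be sketched as follows. This is one possible approach rather than the course's reference solution. X and ground_truth are again hypothetical synthetic stand-ins, since the real data is not shown on this page. The explicit labels=[-1, 1] argument is an addition that fixes the row and column order of the matrix.

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.neighbors import LocalOutlierFactor as lof

# Hypothetical stand-in data: 180 inliers (label 1) and 20 outliers (label -1).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(180, 2)),
    rng.uniform(low=-8.0, high=8.0, size=(20, 2)),
])
ground_truth = np.concatenate([np.ones(180, dtype=int), -np.ones(20, dtype=int)])

# Step 1: default settings. In recent scikit-learn versions the default
# contamination is 'auto'; older versions used a fixed 0.1.
preds_default = lof().fit_predict(X)
cm_default = confusion_matrix(ground_truth, preds_default, labels=[-1, 1])
print(cm_default)

# Step 2: ask the detector to flag roughly 20% of the points as outliers.
preds_02 = lof(contamination=0.2).fit_predict(X)
cm_02 = confusion_matrix(ground_truth, preds_02, labels=[-1, 1])
print(cm_02)

# Step 3: set contamination to the actual proportion of outliers in the labels.
true_share = np.mean(ground_truth == -1)
preds_true = lof(contamination=true_share).fit_predict(X)
cm_true = confusion_matrix(ground_truth, preds_true, labels=[-1, 1])
print(cm_true)
```

In each matrix, rows are the true labels and columns are the predicted labels, both ordered -1 then 1. A higher contamination value moves more points into the predicted -1 column, which trades missed outliers for false alarms. Note that scikit-learn only accepts a numeric contamination in the range (0, 0.5].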