Using outlier probabilities
An alternative to isolating outliers with contamination is to use outlier probabilities. The main advantage of this approach is that you can choose any probability threshold, so you decide how confident the model must be before a point is flagged as an outlier.
IForest and big_mart are already loaded.
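To illustrate how the threshold controls the result, here is a minimal sketch in plain Python. The probability values and the helper flag_outliers are made up for illustration and are not part of the exercise. It only shows that a stricter threshold flags fewer rows.

```python
# Hypothetical outlier probabilities for six rows (made-up values for illustration)
outlier_probs = [0.05, 0.12, 0.48, 0.71, 0.86, 0.97]


def flag_outliers(probs, threshold):
    """Return the indices whose outlier probability exceeds the threshold."""
    return [i for i, p in enumerate(probs) if p > threshold]


# A lenient threshold flags more rows, a strict one only the most extreme
lenient = flag_outliers(outlier_probs, 0.4)
strict = flag_outliers(outlier_probs, 0.9)
print(lenient, strict)
```

Raising the threshold trades recall for confidence: fewer points are flagged, but each flagged point has a higher estimated probability of being an outlier.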
This exercise is part of the course Anomaly Detection in Python.
Exercise instructions
- Calculate probabilities for both inliers and outliers.
- Extract the probabilities for outliers into outlier_probs.
- Filter the outliers into outliers by using a 70% threshold on outlier_probs.
Interactive exercise
Try this exercise by completing the sample code.
iforest = IForest(random_state=10).fit(big_mart)
# Calculate probabilities
probs = iforest.____
# Extract the probabilities for outliers
outlier_probs = ____[____]
# Filter for when the probability is higher than 70%
outliers = ____[____]
print(len(outliers))