KNN with outlier probabilities
Since we cannot wholly trust the output when using contamination, let's double-check our work using outlier probabilities, which are more trustworthy.
The dataset has been loaded as `females`, and the `KNN` estimator has also been imported.
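To make the contrast concrete, here is a minimal sketch in plain Python. The scores, the `contamination` value, and the min-max scaling step are all assumptions chosen for illustration; they are not taken from the exercise data or from the estimator's internals. The idea is that a contamination setting flags a fixed share of rows regardless of how the scores are distributed, while a probability threshold only flags rows whose scaled score is actually high.

```python
# Hypothetical raw outlier scores (for example, distance to the k-th nearest neighbor).
scores = [1.0, 1.2, 0.9, 1.1, 1.3, 5.0]

# Contamination-style labeling: flag a fixed fraction of rows, whatever the scores look like.
contamination = 0.5  # assumed value, deliberately too high
n_flagged = int(len(scores) * contamination)
cutoff = sorted(scores, reverse=True)[n_flagged - 1]
flagged_by_contamination = [s >= cutoff for s in scores]

# Probability-style labeling: min-max scale the scores to [0, 1], then apply a threshold.
lo, hi = min(scores), max(scores)
probs = [(s - lo) / (hi - lo) for s in scores]
flagged_by_probability = [p > 0.55 for p in probs]

print(sum(flagged_by_contamination))  # rows flagged by the fixed fraction
print(sum(flagged_by_probability))    # rows flagged by the probability threshold
```

With these made-up numbers, the fixed fraction flags several ordinary-looking scores, while the probability threshold singles out only the one score that sits far from the rest.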
This exercise is part of the course
Anomaly Detection in Python
Exercise instructions
- Instantiate `KNN` with 20 neighbors.
- Calculate outlier probabilities.
- Create a boolean mask that returns true values where the outlier probability is over 55%.
- Use `is_outlier` to filter the outliers from `females`.
Interactive exercise
Try this exercise by completing this sample code.
# Instantiate a KNN with 20 neighbors and fit to `females`
knn = ____
knn.____
# Calculate probabilities
probs = ____
# Create a boolean mask
is_outlier = ____
# Use the boolean mask to filter the outliers
outliers = ____
print(len(outliers))