Exercise

Finding outliers with z-scores

The normal distribution shows up throughout the natural world and is one of the most commonly encountered distributions. This is why the z-score method, which assumes normally distributed data, can be one of the quickest ways to detect outliers.

Recall the rule of thumb from the video: if a sample is more than three standard deviations away from the mean, you can consider it an extreme value.
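To make the rule of thumb concrete, here is a minimal sketch that computes z-scores by hand with NumPy. The helper name z_scores and the sample array are made up for illustration and are not part of the exercise; the population standard deviation (NumPy's default) is assumed.

```python
import numpy as np

def z_scores(values):
    """Return (x - mean) / std for each value, using the population std."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()

# Hypothetical data: twenty identical values and one far-away value
sample = np.array([10.0] * 20 + [100.0])

scores = z_scores(sample)

# Apply the rule of thumb: keep values more than 3 standard deviations from the mean
extreme = sample[np.abs(scores) > 3]
print(extreme)
```

Only the value 100.0 lies more than three standard deviations from the mean here, so it is the only one flagged.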

However, recall also that the z-score method should be approached with caution. This method is appropriate only when we are confident our data comes from a normal distribution. Otherwise, the results might be misleading.
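One possible way to sanity-check the normality assumption before relying on z-scores is a formal test such as Shapiro-Wilk. This is only a sketch: the choice of test, the 0.05 threshold, and the synthetic samples below are assumptions for illustration, not something the exercise prescribes.

```python
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(0)

# Hypothetical samples: one drawn from a normal distribution, one heavily skewed
normal_sample = rng.normal(loc=100, scale=10, size=500)
skewed_sample = rng.lognormal(mean=0, sigma=1, size=500)

# shapiro returns the test statistic and a p-value;
# a small p-value is evidence against normality
_, p_normal = shapiro(normal_sample)
_, p_skewed = shapiro(skewed_sample)

print(f"normal sample p-value: {p_normal:.4f}")
print(f"skewed sample p-value: {p_skewed:.4g}")

# Assumed convention: treat p < 0.05 as "probably not normal"
if p_skewed < 0.05:
    print("The skewed sample is unlikely to be normal; z-scores may mislead.")
```

A histogram or Q-Q plot is a reasonable visual alternative when a formal test feels too strict for large samples.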

The prices distribution has been loaded for you.

Instructions

  • Import the zscore function from the relevant scipy module.
  • Find the z-scores of prices and store them in scores.
  • Create a boolean mask named is_over_3 to check if the absolute values of scores are greater than 3.
  • Use the mask to filter prices for outliers.
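The steps above can be sketched as follows. Because the real prices data is preloaded in the exercise environment and not shown here, the block builds a hypothetical stand-in Series with two planted extreme values; the name outliers is also an assumption, since the instructions do not say where to store the result.

```python
import numpy as np
import pandas as pd
from scipy.stats import zscore

# Hypothetical stand-in for the preloaded `prices` (not the real exercise data):
# 200 roughly normal values plus two planted extremes
rng = np.random.default_rng(42)
prices = pd.Series(
    np.append(rng.normal(loc=100, scale=10, size=200), [250.0, 5.0]),
    name="price",
)

# Find the z-scores of prices
scores = zscore(prices)

# Boolean mask: True where the absolute z-score is greater than 3
is_over_3 = np.abs(scores) > 3

# Use the mask to filter prices for outliers
outliers = prices[is_over_3]
print(outliers)
```

Taking the absolute value matters: without it, unusually low prices (large negative z-scores) would slip through the mask.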