
Finding outliers with z-scores

The normal distribution appears throughout the natural world and is one of the most commonly encountered distributions. When data is approximately normal, the z-score method is one of the quickest ways to detect outliers.

Recall the rule of thumb from the video: if a sample is more than three standard deviations away from the mean, you can consider it an extreme value.
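
As a minimal sketch of this rule of thumb, the z-score of each value can be computed by hand as its distance from the mean divided by the standard deviation. The array `sample` and the planted value 250.0 are hypothetical and are not the exercise's data.

```python
import numpy as np

# Hypothetical data: 200 roughly normal values plus one planted extreme value
rng = np.random.default_rng(seed=0)
sample = np.append(rng.normal(loc=100, scale=10, size=200), 250.0)

# z-score: distance from the mean, measured in standard deviations
z = (sample - sample.mean()) / sample.std()

# Rule of thumb: flag values more than three standard deviations from the mean
extreme = sample[np.abs(z) > 3]
print(extreme)
```

The planted value sits far more than three standard deviations from the mean, so it is among the flagged values.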

However, recall also that the z-score method should be approached with caution. This method is appropriate only when we are confident our data comes from a normal distribution. Otherwise, the results might be misleading.
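
To illustrate why non-normal data can mislead, the sketch below applies the same three-standard-deviation rule to a normal sample and to a right-skewed (log-normal) sample. Both samples and the helper `share_flagged` are hypothetical, chosen only for illustration. For normal data, roughly 0.3% of points are expected to fall beyond the threshold; on skewed data the rule tends to flag a noticeably larger share, even though those points are an ordinary part of the long tail.

```python
import numpy as np
from scipy.stats import zscore

# Hypothetical samples: one normal, one right-skewed (log-normal)
rng = np.random.default_rng(seed=42)
normal_data = rng.normal(loc=0.0, scale=1.0, size=10_000)
skewed_data = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)

def share_flagged(data):
    """Return the fraction of points whose absolute z-score exceeds 3."""
    return np.mean(np.abs(zscore(data)) > 3)

frac_normal = share_flagged(normal_data)
frac_skewed = share_flagged(skewed_data)
print(f"normal: {frac_normal:.2%}, skewed: {frac_skewed:.2%}")
```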

The prices distribution has been loaded for you.

This exercise is part of the course

Anomaly Detection in Python


Exercise instructions

  • Import the zscore function from the relevant scipy module.
  • Find the z-scores of prices and store them into scores.
  • Create a boolean mask named is_over_3 to check if the absolute values of scores are greater than 3.
  • Use the mask to filter prices for outliers.
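
The same sequence of steps can be sketched on a small stand-in array. The name `values` and its contents are assumptions made for illustration; the exercise's `prices` data is not shown here.

```python
import numpy as np
from scipy.stats import zscore

# Hypothetical data: twenty ordinary values and one very large one
values = np.array([10.0] * 10 + [11.0] * 10 + [100.0])

# Standardize every value (zscore uses the population standard deviation by default)
values_scores = zscore(values)

# Boolean mask: True where the absolute z-score exceeds 3
mask = np.abs(values_scores) > 3

# Apply the mask to keep only the flagged values
flagged = values[mask]
print(len(flagged))
```

Taking the absolute value before comparing means that unusually small values are flagged as well as unusually large ones.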

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Import the zscore function
from scipy.____ import ____

# Find the z-scores of prices
scores = ____(____)

# Check if the absolute values of scores are over 3
is_over_3 = ____

# Use the mask to subset prices
outliers = ____[____]

print(len(outliers))