Machine Learning for Time Series Data in Python

Exercise

Handling outliers

In this exercise, you'll handle outliers: data points so different from the rest of your data that you treat them differently from other "normal-looking" data points. You'll use the output from the previous exercise (percent change over time) to detect the outliers. First, you'll write a function that replaces outlier data points with the median value of the entire time series.

Instructions

  • Define a function that takes an input series and does the following:
    • Calculates the absolute value of each datapoint's distance from the series mean, then creates a boolean mask for datapoints that are more than three standard deviations from the mean.
    • Uses this boolean mask to replace the outliers with the median of the entire series.
  • Apply this function to your data and visualize the results using the given code.
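
The steps above can be sketched as follows. This is a minimal illustration, not the exercise's reference solution: the names `replace_outliers`, `n_std`, and `perc_change` are assumptions, the sample data is made up, and the plotting code the exercise provides is not reproduced here. Note that pandas' `Series.std()` computes the sample standard deviation (`ddof=1`), which can differ slightly from NumPy's default.

```python
import pandas as pd


def replace_outliers(series: pd.Series, n_std: float = 3.0) -> pd.Series:
    """Return a copy of `series` with outliers replaced by the series median.

    A point counts as an outlier when its absolute distance from the mean
    exceeds `n_std` standard deviations (hypothetical parameter, default 3).
    """
    # Absolute distance of each data point from the series mean
    abs_diff = (series - series.mean()).abs()

    # Boolean mask: True where a point lies beyond the threshold
    is_outlier = abs_diff > n_std * series.std()

    # Replace masked points with the median of the entire series;
    # Series.mask returns a new Series, so the input is left unchanged
    return series.mask(is_outlier, series.median())


# Made-up stand-in for the percent-change data: 20 small values and one spike
perc_change = pd.Series([1.0, -1.0] * 10 + [100.0])
cleaned = replace_outliers(perc_change)
```

For a DataFrame with one column per series, the same function could be applied column by column with `DataFrame.apply(replace_outliers)`. Returning a new Series rather than modifying the input in place is a design choice made here so that the original data stays available for comparison when plotting.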