Rolling 360-day mean & std. deviation for NYC ozone data since 2000

The last video also showed you how to calculate several rolling statistics using the .agg() method, similar to .groupby().

Let's take a closer look at the air quality history of NYC using the ozone data you have seen before. The daily data are very volatile, so a longer-term rolling average can help reveal the underlying trend.

You'll be using a 360-day rolling window and .agg() to calculate the rolling mean and standard deviation of the daily average ozone values since 2000.
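To make the idea concrete, here is a minimal sketch of passing a list of statistic names to .agg() on a rolling window. The series is synthetic (the values are made up, not the ozone data), and the window is shortened to 3 periods so the result is easy to check by hand.

```python
import numpy as np
import pandas as pd

# Synthetic daily series standing in for the ozone data (values are invented)
dates = pd.date_range("2000-01-01", periods=10, freq="D")
ozone = pd.Series(np.arange(10, dtype=float), index=dates, name="Ozone")

# A 3-period window keeps the output small; the exercise uses 360 periods
rolling_stats = ozone.rolling(3).agg(["mean", "std"])

# The result is a DataFrame with one column per statistic;
# the first two rows are NaN because the window is not yet full
print(rolling_stats.head())
```

Each name in the list becomes a column, which is what makes the result convenient to join back onto the original data.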

This exercise is part of the course

Manipulating Time Series Data in Python

Exercise instructions

We have already imported pandas as pd, and matplotlib.pyplot as plt.

  • Use pd.read_csv() to import 'ozone.csv', creating a DatetimeIndex from the 'date' column using parse_dates and index_col; assign the result to data and drop missing values using .dropna().
  • Select the 'Ozone' column and create a .rolling() window using 360 periods, apply .agg() to calculate the mean and std, and assign this to rolling_stats.
  • Use .join() to concatenate data with rolling_stats, and assign to stats.
  • Plot stats using subplots.
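The first three steps can be sketched as follows. This is an illustration under stated assumptions rather than the reference solution: the inline CSV is a hypothetical stand-in for 'ozone.csv' (the column names follow the instructions, the values are invented), and the window is 3 periods instead of 360 so the tiny sample produces non-missing results.

```python
import io
import pandas as pd

# Hypothetical stand-in for 'ozone.csv'; values are invented for illustration
csv_text = """date,Ozone
2000-01-01,0.004
2000-01-02,0.009
2000-01-03,
2000-01-04,0.015
2000-01-05,0.008
2000-01-06,0.011
"""

# Parse 'date' as datetimes, use it as the index, and drop the missing row
data = pd.read_csv(
    io.StringIO(csv_text), parse_dates=["date"], index_col="date"
).dropna()

# Rolling mean and std over 3 observations (the exercise uses 360)
rolling_stats = data["Ozone"].rolling(3).agg(["mean", "std"])

# Align the statistics with the original values on the shared date index
stats = data.join(rolling_stats)
print(stats)
```

Because the integer window counts observations rather than calendar days, dropping missing values first means each window spans the stated number of valid readings.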

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Import and inspect ozone data here
data = ____

# Calculate the rolling mean and std here
rolling_stats = ____

# Join rolling_stats with ozone data
stats = ____

# Plot stats
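
For the plotting step, one possible approach is the subplots argument of DataFrame.plot(), which draws each column on its own axis. The sketch below uses random data and a 10-period window purely for illustration, and it selects the non-interactive "Agg" backend as an assumption so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Random daily values standing in for the ozone data (not real measurements)
rng = np.random.default_rng(0)
dates = pd.date_range("2000-01-01", periods=60, freq="D")
data = pd.DataFrame({"Ozone": rng.random(60)}, index=dates)

# Join the original values with their rolling mean and std
stats = data.join(data["Ozone"].rolling(10).agg(["mean", "std"]))

# One subplot per column: Ozone, mean, std
axes = stats.plot(subplots=True)
plt.close("all")
```

In an interactive session you would typically call plt.show() instead of closing the figure.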