Rolling 360-day mean & std. deviation for NYC ozone data since 2000
The last video also showed you how to calculate several rolling statistics using the .agg() method, similar to .groupby().
Let's take a closer look at the air quality history of NYC using the Ozone data you have seen before. The daily data are very volatile, so a longer-term rolling average can help reveal the underlying trend.
You'll use a 360-day rolling window and .agg() to calculate the rolling mean and standard deviation of the daily average ozone values since 2000.
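As a minimal sketch of the idea, the snippet below applies a 360-period rolling window with .agg() to a synthetic daily series. The series values, the random seed, and the variable names are assumptions made for illustration; they are not the course's ozone data.

```python
import numpy as np
import pandas as pd

# Synthetic daily series standing in for the ozone data (values are made up).
idx = pd.date_range("2000-01-01", periods=1000, freq="D")
rng = np.random.default_rng(0)
ozone = pd.Series(rng.normal(loc=0.03, scale=0.01, size=len(idx)),
                  index=idx, name="Ozone")

# 360-period rolling window; .agg() computes several statistics in one call.
rolling_stats = ozone.rolling(360).agg(["mean", "std"])

# With the default min_periods, rows are NaN until the window is full.
print(rolling_stats.dropna().head())
```

The result is a DataFrame with one column per statistic, which is what makes the later join with the original data straightforward.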
This exercise is part of the course
Manipulating Time Series Data in Python
Exercise instructions
We have already imported pandas as pd, and matplotlib.pyplot as plt.
- Use pd.read_csv() to import 'ozone.csv', creating a DatetimeIndex from the 'date' column using parse_dates and index_col, assign the result to data, and drop missing values using .dropna().
- Select the 'Ozone' column and create a .rolling() window using 360 periods, apply .agg() to calculate the mean and std, and assign this to rolling_stats.
- Use .join() to concatenate data with rolling_stats, and assign to stats.
- Plot stats using subplots.
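The import step can be sketched on a small in-memory CSV. The four rows below are made-up values standing in for 'ozone.csv', and io.StringIO is used only so the example runs without the file; in the exercise you would pass the filename instead.

```python
import io
import pandas as pd

# Tiny in-memory CSV standing in for 'ozone.csv' (made-up values, one missing).
csv_text = """date,Ozone
2000-01-01,0.004
2000-01-02,
2000-01-03,0.010
2000-01-04,0.008
"""

# parse_dates and index_col turn the 'date' column into a DatetimeIndex;
# .dropna() removes the row with the missing Ozone value.
data = pd.read_csv(io.StringIO(csv_text),
                   parse_dates=["date"],
                   index_col="date").dropna()

print(data.info())
```

Dropping missing values before the rolling calculation matters here, because a NaN inside a window makes that window's statistic NaN under the default min_periods.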
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import and inspect ozone data here
data = ____
# Calculate the rolling mean and std here
rolling_stats = ____
# Join rolling_stats with ozone data
stats = ____
# Plot stats
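
One possible way to carry out the join and plot steps is sketched below on synthetic data. The DataFrame contents, the random seed, and the choice of the non-interactive "Agg" backend are assumptions made so the example runs anywhere; this is not the course's reference solution.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the cleaned ozone DataFrame (values are made up).
idx = pd.date_range("2000-01-01", periods=1000, freq="D")
rng = np.random.default_rng(42)
data = pd.DataFrame({"Ozone": rng.normal(0.03, 0.01, size=len(idx))}, index=idx)

# Rolling mean and std over 360 periods for the 'Ozone' column.
rolling_stats = data["Ozone"].rolling(360).agg(["mean", "std"])

# .join() aligns on the shared DatetimeIndex and adds the new columns.
stats = data.join(rolling_stats)

# One subplot per column: raw values, rolling mean, rolling std.
axes = stats.plot(subplots=True)
plt.tight_layout()
# In an interactive session, drop the Agg line and call plt.show() here.
```

Separate subplots are useful because the standard deviation is on a much smaller scale than the raw values and would be hard to read on a shared axis.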