Session Ready
Exercise

Signal variance

You're now ready to compare the 2011 weather data with the 30-year normals reported in 2010. You can ask questions such as, on average, how much hotter was every day in 2011 than expected from the 30-year average?

The DataFrames df_clean and df_climate from previous exercises are available in the workspace.

Your job is to first resample df_clean and df_climate by day and aggregate the mean temperatures. You will then extract the temperature related columns from each - 'dry_bulb_faren' in df_clean, and 'Temperature' in df_climate - as NumPy arrays and compute the difference.

Notice that the indexes of df_clean and df_climate are not aligned - df_clean has dates in 2011, while df_climate has dates in 2010. This is why you extract the temperature columns as NumPy arrays. An alternative approach is to use the pandas .reset_index() method to make sure the Series align properly. You will practice this approach as well.

Instructions
100 XP
  • Downsample df_clean with daily frequency and aggregate by the mean. Store the result as daily_mean_2011.
  • Extract the 'dry_bulb_faren' column from daily_mean_2011 as a NumPy array using .values. Store the result as daily_temp_2011. Note: .values is an attribute, not a method, so you don't have to use ().
  • Downsample df_climate with daily frequency and aggregate by the mean. Store the result as daily_climate.
  • Reset the index of daily_climate and extract the Temperature column. To do this, first reset the index of daily_climate using the .reset_index() method, and then use bracket slicing to access 'Temperature'. Store the result as daily_temp_climate.