1. Expanding windows
Great progress!
2. Rolling versus expanding windows
In addition to rolling windows, there are also 'expanding windows' in time series analysis.
An expanding window is similar to a rolling window, but its start point is fixed at the beginning of the data, rather than moving along with the window.
This means that, whereas a rolling window has a fixed width, for example seven days, an expanding window has a dynamic, increasing width.
Expanding windows let us calculate statistics such as the mean, maximum, or sum up to and including each observation in the data. Let's look at how an expanding window works.
3. Expanding window process
Here's a conceptual expanding window underneath a time series, with observations one, two, three, and four.
At the first observation, the width of the window is one. At the second, the width is two, and so on. Across the time series, the width of an expanding window at each observation equals that observation's position in the series. Expanding windows answer the question, "what is the summary of our available data at each point in time?"
Of note is that the starting point of an expanding window is fixed; all four 'windows' in the diagram start at the first observation.
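To make the four-observation picture concrete, here is a minimal sketch in base R. The vector obs and its values are made up for illustration; the point is only that each result summarizes observations one through i.

```r
# Hypothetical four-observation series (values chosen for illustration only)
obs <- c(10, 20, 60, 30)

# Window width at each observation: 1, 2, 3, 4
widths <- seq_along(obs)

# Expanding statistics: element i summarizes observations 1..i
exp_sum  <- cumsum(obs)            # 10 30 90 120
exp_max  <- cummax(obs)            # 10 20 60 60
exp_mean <- cumsum(obs) / widths   # 10 15 30 30

# The same expanding mean written out window by window;
# every window starts at observation 1
exp_mean_explicit <- sapply(widths, function(i) mean(obs[1:i]))
```

The last element of exp_mean is the mean of all four values, since the final window covers the whole series.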
4. Calculating an expanding window
We have a concept for creating an expanding window, so how do we calculate one in R?
The good news is that we already know the function that can calculate expanding windows: rollapply, from the zoo package!
As we've used it, rollapply takes a single number for the width argument, but we can also assign it a vector of widths that change as it moves across the data.
If the widths of an expanding window increase by one for each element in the time series, we need a function that can output a sequence of numbers, with a total length equal to the number of observations in the time series.
Base R has the function seq_along, or "sequence-along", that returns a vector of numbers increasing by one, for the length of the input object. It's the perfect function for returning an increasing window width for our expanding windows!
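As a quick illustration of what seq_along returns, here is a small sketch. The vector temps is a hypothetical stand-in for a time series, not data from the course.

```r
# Hypothetical numeric vector standing in for a time series
temps <- c(21.5, 19.0, 23.2, 22.1, 20.4)

# One integer per element, counting up from 1
widths <- seq_along(temps)          # 1 2 3 4 5

# For an empty input, seq_along returns an empty integer vector
no_widths <- seq_along(numeric(0))  # integer(0)
```

The length of the result always matches the length of the input, which is exactly what a vector of expanding widths needs.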
5. Calculating an expanding window
Let's assign the sequence of widths, made using seq_along on the daily_temp time series, to a variable exp_widths, or 'expanding widths'.
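We can use exp_widths within a call to rollapply, using much the same syntax as when we create a rolling window. When creating an expanding window, we must set the align argument to 'right', so that each window ends at the current observation and reaches back toward the start of the series. rollapply's default alignment is 'center', which would place each window around the observation instead.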
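The call described here might look like the following sketch. The course's daily_temp data is not reproduced in this transcript, so a ten-day zoo series with made-up values stands in for it; exp_mean is a name chosen for illustration.

```r
library(zoo)

# Hypothetical stand-in for daily_temp: ten days of made-up temperatures
dates <- seq(as.Date("2021-01-01"), by = "day", length.out = 10)
daily_temp <- zoo(c(12, 14, 13, 17, 15, 11, 16, 18, 14, 20), order.by = dates)

# One width per observation: 1, 2, ..., 10
exp_widths <- seq_along(daily_temp)

# Expanding mean: window i ends at observation i and is i values wide
exp_mean <- rollapply(daily_temp, width = exp_widths, FUN = mean, align = "right")
```

Swapping FUN for max or sum gives an expanding maximum or expanding sum with the same widths.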
6. Plotting expanding windows
Let's plot everything together. We can plot our original time series, daily_temp, in light gray, with the expanding window overlaid in red, using one geom_line layer from ggplot2 for each series.
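One way such a plot could be built is sketched below. It assumes ggplot2 and zoo are installed, uses made-up data in place of the course's daily_temp, and converts the series to a data frame first; the names plot_df and p are hypothetical, and the course's own plotting code may differ.

```r
library(zoo)
library(ggplot2)

# Hypothetical stand-in for daily_temp (made-up values)
dates <- seq(as.Date("2021-01-01"), by = "day", length.out = 10)
daily_temp <- zoo(c(12, 14, 13, 17, 15, 11, 16, 18, 14, 20), order.by = dates)

# Expanding mean, as computed with a vector of widths
exp_mean <- rollapply(daily_temp, width = seq_along(daily_temp),
                      FUN = mean, align = "right")

# Collect both series in one data frame for ggplot2
plot_df <- data.frame(
  date     = index(daily_temp),
  temp     = as.numeric(coredata(daily_temp)),
  exp_mean = as.numeric(coredata(exp_mean))
)

# Original series in light gray, expanding mean overlaid in red
p <- ggplot(plot_df, aes(x = date)) +
  geom_line(aes(y = temp), color = "lightgray") +
  geom_line(aes(y = exp_mean), color = "red") +
  labs(x = "Date", y = "Temperature")
```

Printing p draws the figure.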
The further along in an expanding window, the more values there are to summarize. An expanding mean, like this, approaches the overall mean of the dataset – the final calculation of an expanding mean is the same as the global mean, because the window covers the entire dataset.
7. Expanding window inferences
There are some properties we can infer about expanding windows that are important to consider when performing a proper time series analysis.
As mentioned in the example of the expanding mean, the statistics from an expanding window converge on the global statistics of the time series. The 'further along' in an expanding window, the more of the data each value reflects, and the final value is exactly the overall statistic.
Because an expanding window considers more and more values as it progresses, each observation carries less weight: in an expanding mean over n values, a single observation contributes one n-th of the result. A single outlier therefore has a large effect near the beginning of the series and a much smaller one further along. Notice how the graph rapidly 'smooths out' after the first few months?
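The outlier effect can be checked with a small base R sketch. The series below is artificial: one hundred values of 10, with a single outlier of 110 placed either early or late. All names are hypothetical.

```r
# Hypothetical flat series: 100 values, all equal to 10
base_series <- rep(10, 100)

# Same outlier (110), placed early in one copy and late in the other
early <- base_series
early[2] <- 110
late <- base_series
late[90] <- 110

# Expanding means via cumulative sums
exp_mean_early <- cumsum(early) / seq_along(early)
exp_mean_late  <- cumsum(late) / seq_along(late)

# Size of the jump at the outlier itself: 100 / n
jump_early <- exp_mean_early[2] - 10   # 50
jump_late  <- exp_mean_late[90] - 10   # about 1.11
```

Both expanding means finish at 11, the global mean of each series, but the early outlier moves the line far more when it occurs.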
8. Let's practice!
Alright, let's expand our time series skills and practice creating expanding windows in the exercises!