1. What is a rolling window?
Doing great so far!
2. Windows
In this chapter, we'll discuss some key applications of time series windows that allow for more complex analysis and visualization.
Recall that a window is a range of observations defined by a start and end point. A window is like a computer monitor: if we visit a lengthy website, we can only view a limited range of information at once.
3. Windows
Windows let us zoom in on a range of data, filtering out the 'big picture' and focusing on the specific details. Here, our window ranges from 1992 to 1993 – it's a one-year window of the larger time series.
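To make this concrete, here is a minimal sketch of extracting a one-year window with base R's window() function. The lesson's own FTSE object isn't shown, so we assume the FTSE column of the built-in EuStockMarkets dataset as a stand-in; the names ftse and ftse_window are ours.

```r
# Assumption: EuStockMarkets (shipped with base R) stands in for the
# lesson's FTSE series, which is not shown in the transcript.
ftse <- EuStockMarkets[, "FTSE"]

# Keep only the observations between the start of 1992 and the start of 1993.
ftse_window <- window(ftse, start = 1992, end = 1993)

range(time(ftse))          # time span of the full series
range(time(ftse_window))   # time span of the one-year window
length(ftse_window)        # far fewer observations than the full series
```

The window is still a regular time series, so we can plot or summarize it just like the original.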
4. Global summary statistics
Time series analysis often involves determining statistics about data. Suppose we wanted to determine the average value of a time series, like our FTSE series. One approach is to take the 'global' mean of the entire dataset.
However, we end up with only a single value. That is useful, but it tells us little about how our data changes over time: a global statistic is one number that summarizes the whole series.
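As a quick sketch of a global statistic, we can take the mean of an entire series. Again, the FTSE column of the built-in EuStockMarkets dataset is an assumed stand-in for the lesson's data, and the variable names are ours.

```r
# Assumption: EuStockMarkets stands in for the lesson's FTSE series.
ftse <- EuStockMarkets[, "FTSE"]

# The global mean collapses every observation into a single number.
global_mean <- mean(ftse)

global_mean
length(ftse)         # many observations go in...
length(global_mean)  # ...one value comes out
```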
5. Rolling window
Windows, however, provide a way to calculate statistics that 'move' with the data – this concept is called a 'rolling window'.
Let's make an example rolling window: a 30-day rolling average of our FTSE stock price.
At each observation, we create a window of a particular width, 30 days, then find the average of the values within that window. Doing this at each point in the data creates a rolling window; the result is how the average changes across time!
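The idea can be sketched by hand in base R before reaching for a package. The prices vector below is made up for illustration, and we use a width of 3 instead of 30 so the result is easy to check. We also assume a window that looks backward from each observation.

```r
# Hypothetical prices, invented for illustration only.
prices <- c(100, 102, 101, 105, 107, 106)
k <- 3  # window width in observations (3 here rather than 30, for readability)

# Start with NA everywhere; positions without a full window stay NA.
rolling_avg <- rep(NA_real_, length(prices))

# At each observation with at least k values available, average the
# current value and the k - 1 values before it.
for (i in seq(k, length(prices))) {
  rolling_avg[i] <- mean(prices[(i - k + 1):i])
}

rolling_avg
```

Each non-missing value in rolling_avg is the mean of three consecutive prices, so the output traces how the local average moves through the series.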
6. Rolling with zoo
To calculate a rolling window, we can use the zoo package, which offers 'rolling' versions of summary functions like mean, sum, and maximum: rollmean, rollsum, and rollmax. Each of these functions takes a time series plus some additional arguments that control how the window is created at each point; let's check them out.
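Here is a small sketch of three of zoo's rolling functions applied to a made-up vector. It assumes the zoo package is installed; the vector x is hypothetical. With no fill argument, each function returns only the positions where a full window exists.

```r
library(zoo)  # assumes zoo is installed

# Hypothetical values, invented for illustration only.
x <- c(1, 3, 2, 5, 4)

roll_mean <- rollmean(x, k = 3)  # mean of each run of 3 values
roll_sum  <- rollsum(x, k = 3)   # sum of each run of 3 values
roll_max  <- rollmax(x, k = 3)   # maximum of each run of 3 values

roll_mean
roll_sum
roll_max
```

Five observations with a width of 3 give three full windows, so each result has three values.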
7. Window arguments
The argument k defines the width of the rolling window in terms of number of observations; if we wanted a seven-day rolling average of daily data, we would set k = 7.
align specifies whether the output of the function comes to the left, right, or in the middle of the rolling window; we'll give more detail on this momentarily!
The last argument is fill, which supplies a value for the positions where a full window can't be formed; it's usually best to set fill to NA.
With k = 7, a right-aligned window, and fill set to NA, the first six elements of the output are NA. This is due to how we set the align and fill arguments.
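A short sketch of these three arguments working together, assuming zoo is installed. The daily vector is made up for illustration; the call itself uses only the k, align, and fill arguments described above.

```r
library(zoo)  # assumes zoo is installed

# Ten hypothetical daily values, invented for illustration only.
daily <- c(10, 12, 11, 13, 15, 14, 16, 18, 17, 19)

# Seven-day rolling average, placed at the end of each window,
# with NA wherever a full window is not available.
weekly_avg <- rollmean(daily, k = 7, align = "right", fill = NA)

weekly_avg
sum(is.na(weekly_avg))  # 6, that is k - 1
```

The output has the same length as the input, which makes it easy to store alongside the original series or plot on the same axis.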
8. NA values
When we create a right-aligned rolling window with fill set to NA, the first k - 1 values will be NA, as there aren't enough earlier observations to make a full window. Here, with k = 7, the first six are NA.
9. Window alignment
The align argument defines where the output of the rolling function is placed relative to the window. For a right-aligned window, the mean is placed at the end point of the window, so a right-aligned mean averages the current observation and the k - 1 observations before it. Here's an example time series, called data, with seven observations.
10. Window alignment
Left-alignment does the opposite: the window starts at the current observation and extends forward over the next k - 1 observations, so the output sits at the start point, on the left edge of the window.
11. Window alignment
In center-alignment, the output of the rolling function falls in the middle of the window.
The correct alignment to use depends on our data and the patterns we intend to show, but generally, right-alignment is the most common method.
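To compare the three alignments side by side, here is a sketch on a seven-observation series called data, assuming zoo is installed. The values are made up, since the slide's actual numbers aren't in the transcript, and we use k = 3. zoo's rolling functions default to center alignment, so we pass align explicitly each time.

```r
library(zoo)  # assumes zoo is installed

# Seven hypothetical observations, invented for illustration only.
data <- c(2, 4, 6, 8, 10, 12, 14)

right_avg  <- rollmean(data, k = 3, align = "right",  fill = NA)
left_avg   <- rollmean(data, k = 3, align = "left",   fill = NA)
center_avg <- rollmean(data, k = 3, align = "center", fill = NA)

right_avg   # NA NA  4  6  8 10 12
left_avg    #  4  6  8 10 12 NA NA
center_avg  # NA  4  6  8 10 12 NA
```

The computed means are identical in all three cases; only their position changes, which is why the NA values move from the start, to the end, to both ends.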
12. Let's practice!
Let's roll on over to the exercises and put things into practice!