Calculate and plot a seasonal average
In the previous exercise you used endpoints() and period.apply() to quickly calculate the win/loss average for the Boston Red Sox at the end of each season. But what if you need to know the cumulative average throughout each season? Statisticians and sports fans alike often rely on this average to compare a team with its rivals.
To calculate a cumulative average in each season, you'll need to return to the split-lapply-rbind formula practiced in Chapter Three. First, you'll split the data by season, then you'll apply a cumulative mean function to the win_loss column in each season, and finally you'll bind the values back into a single xts object.
A custom cummean() function, which computes a cumulative sum and divides it by the number of values included in the sum, has been written for you. The redsox_xts data, including the win_loss column, is available in your workspace.
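The course supplies cummean() without showing its body. A minimal sketch of how such a helper could be written, assuming it does nothing more than the cumsum() and seq_along() combination described above (the actual course definition may differ):

```r
# Hypothetical re-creation of the cummean() helper; the course's own
# definition is not shown in this exercise.
cummean <- function(x) {
  # Running total divided by the count of values seen so far
  cumsum(x) / seq_along(x)
}

# Toy input: win, loss, win, win (1 = win, 0 = loss)
running_avg <- cummean(c(1, 0, 1, 1))
print(running_avg)
# → 1, 0.5, 0.667 (rounded), 0.75
```

Note that cumsum() propagates missing values, so in this sketch a single NA would make every later value in that season NA as well.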
This exercise is part of the course
Case Study: Analyzing City Time Series Data in R
Exercise instructions
- Use split() to break up the redsox_xts data into seasons (in this case, years). Assign this to redsox_seasons.
- Use lapply() to calculate the cumulative mean for each season. For this exercise, a cummean() function has been designed which calculates the cumulative sum (using cumsum()) and divides by the number of entries in the sum (using seq_along()). Save this data to redsox_ytd.
- Use do.call() with rbind to convert your list output to a single xts object (redsox_winloss) which contains the win/loss average throughout each season.
- Use plot.xts() to view the cumulative win/loss average during the 2013 season. Leave the ylim argument as is in your prewritten code.
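The split-lapply-rbind steps listed above can be tried on a small made-up series before working with the real data. The sketch below is illustrative only: toy_xts, its dates and its win/loss values are invented, and cummean() is re-defined here as an assumed stand-in for the course's helper.

```r
library(xts)

# Invented win/loss data (1 = win, 0 = loss) covering two seasons
dates <- as.Date(c("2012-04-05", "2012-04-07", "2012-04-08",
                   "2013-04-01", "2013-04-03", "2013-04-04"))
toy_xts <- xts(matrix(c(0, 1, 1, 1, 0, 1), ncol = 1,
                      dimnames = list(NULL, "win_loss")),
               order.by = dates)

# Assumed stand-in for the course's cummean() helper
cummean <- function(x) cumsum(x) / seq_along(x)

# Split: one xts object per calendar year
toy_seasons <- split(toy_xts$win_loss, f = "years")

# Apply: cumulative mean within each season, so the average restarts each year
toy_ytd <- lapply(toy_seasons, cummean)

# Bind: stack the per-season results back into a single xts object
toy_winloss <- do.call(rbind, toy_ytd)

# Subset one season with an ISO 8601 year string
print(toy_winloss["2013"])
```

Because the mean is computed inside each split, the first game of 2013 starts from a fresh average instead of carrying over the 2012 results.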
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Split redsox_xts win_loss data into years
redsox_seasons <- split(___$___, f = "___")
# Use lapply to calculate the cumulative mean for each season
redsox_ytd <- lapply(___, cummean)
# Use do.call to rbind the results
redsox_winloss <- do.call(___, ___)
# Plot the win_loss average for the 2013 season
plot.xts(___["___"], ylim = c(0, 1))