Calculate and plot a seasonal average

In the previous exercise you used endpoints() and period.apply() to quickly calculate the win/loss average for the Boston Red Sox at the end of each season. But what if you need to know the cumulative average throughout each season? Statisticians and sports fans alike often rely on this average to compare a team with its rivals.
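To make the distinction concrete, here is a minimal sketch of the end-of-season calculation on a small made-up series. The toy_xts object and its values are hypothetical stand-ins, not the course's redsox_xts data, and the sketch assumes the xts package is installed.

```r
library(xts)

# Made-up results for illustration only: 1 = win, 0 = loss
dates <- as.Date(c("2012-04-05", "2012-04-07", "2012-04-08",
                   "2013-04-01", "2013-04-03", "2013-04-04", "2013-04-05"))
toy_xts <- xts(c(0, 0, 1, 1, 1, 0, 1), order.by = dates)
colnames(toy_xts) <- "win_loss"

# Row positions of the last observation in each year
year_ends <- endpoints(toy_xts, on = "years")

# One average per season, stamped with that season's last game date
season_avg <- period.apply(toy_xts$win_loss, INDEX = year_ends, FUN = mean)
season_avg
```

This gives a single value per season, so it cannot show how the average moved from game to game within a season.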

To calculate a cumulative average in each season, you'll need to return to the split-lapply-rbind formula practiced in Chapter Three. First, you'll split the data by season, then you'll apply a cumulative mean function to the win_loss column in each season, then you'll bind the values back into an xts object.

A custom cummean() function, which computes a cumulative sum and divides each value by the number of entries included in that sum, has been written for you. The redsox_xts data, including the win_loss column, is available in your workspace.
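The exact definition used in the course workspace is not shown here, but a function matching that description could look like the sketch below. Treat it as an assumption about how cummean() is written, built only from cumsum() and seq_along().

```r
# Hypothetical reimplementation; the course's own cummean() may differ in detail
cummean <- function(x) {
  # Running total divided by the count of values seen so far
  cumsum(x) / seq_along(x)
}

# Made-up sequence of results: 1 = win, 0 = loss
wins <- c(1, 0, 1, 1)
running_avg <- cummean(wins)
running_avg  # approximately 1, 0.5, 0.667, 0.75
```

Each element is the average of all results up to and including that game, so the last element equals the plain mean of the whole vector.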

This exercise is part of the course

Case Study: Analyzing City Time Series Data in R

Exercise instructions

  • Use split() to break up the redsox_xts data into seasons (in this case, years). Assign this to redsox_seasons.
  • Use lapply() to calculate the cumulative mean for each season. For this exercise, a cummean() function has been provided that calculates the cumulative sum (using cumsum()) and divides it by the number of entries in the sum (using seq_along()). Save this data to redsox_ytd.
  • Use do.call() with rbind to convert your list output to a single xts object (redsox_winloss) which contains the win/loss average throughout each season.
  • Use plot.xts() to view the cumulative win/loss average during the 2013 season. Leave the ylim argument as is in your prewritten code.
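The steps above can be sketched end to end on made-up data. Everything prefixed with toy_ is hypothetical and stands in for the redsox_ objects, the cummean() definition is an assumed reimplementation, and the xts package is assumed to be installed. This is an illustration of the split-lapply-rbind pattern, not the exercise solution.

```r
library(xts)

# Made-up results across two seasons: 1 = win, 0 = loss
dates <- as.Date(c("2012-04-05", "2012-04-07", "2012-04-08",
                   "2013-04-01", "2013-04-03", "2013-04-04", "2013-04-05"))
toy_xts <- xts(c(0, 0, 1, 1, 1, 0, 1), order.by = dates)
colnames(toy_xts) <- "win_loss"

# Assumed definition of the helper described in the exercise
cummean <- function(x) cumsum(x) / seq_along(x)

# 1. Split the series into a list with one xts object per year
toy_seasons <- split(toy_xts$win_loss, f = "years")

# 2. Apply the cumulative mean within each season, so it restarts every year
toy_ytd <- lapply(toy_seasons, cummean)

# 3. Bind the per-season pieces back into a single xts object
toy_winloss <- do.call(rbind, toy_ytd)

# Subset one season by year string
toy_winloss["2013"]
```

Because the cumulative mean is computed inside each list element, the first game of 2013 starts from a fresh average rather than carrying over the 2012 results. The 2013 subset can then be passed to plot.xts() with ylim = c(0, 1), as in the exercise template.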

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Split redsox_xts win_loss data into years 
redsox_seasons <- split(___$___, f = "___")

# Use lapply to calculate the cumulative mean for each season
redsox_ytd <- lapply(___, cummean)

# Use do.call to rbind the results
redsox_winloss <- do.call(___, ___)

# Plot the win_loss average for the 2013 season
plot.xts(___["___"], ylim = c(0, 1))