Replace missing data - I

As you discovered in the previous exercise, your quarterly GDP data appear to be missing several observations. In fact, your call to summary() in the previous exercise revealed 80 missing data points!

As you may recall from the first xts course, xts and zoo provide a variety of functions to handle missing data.

The simplest technique is the na.locf() function, which carries forward the last observation before the missing data (hence "last observation carried forward", or locf). Because it assumes no change until a new value is observed, this approach is often appropriate when you have reasons to be conservative about growth in your data.

A similar approach works in the opposite direction: it takes the first observation after the missing value and carries it backward ("next observation carried backward", or nocb). You can also do this with na.locf() by setting the fromLast argument to TRUE.
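To make the two directions concrete, here is a minimal sketch on a toy series. The object names (toy_xts, toy_locf, toy_nocb) and the values are made up for illustration and are not part of the course data; na.rm = FALSE is passed explicitly so that any NA that cannot be filled stays in the result.

```r
library(xts)

# Hypothetical quarterly series with gaps (values are invented for illustration)
dates <- as.Date(c("1990-01-01", "1990-04-01", "1990-07-01",
                   "1990-10-01", "1991-01-01"))
toy_xts <- xts(c(100, NA, NA, 130, NA), order.by = dates)

# Last observation carried forward: each NA takes the most recent earlier value
toy_locf <- na.locf(toy_xts, na.rm = FALSE)

# Next observation carried backward: each NA takes the nearest later value
toy_nocb <- na.locf(toy_xts, na.rm = FALSE, fromLast = TRUE)

# Show the original and both filled versions side by side
print(merge(toy_xts, toy_locf, toy_nocb))
```

In this sketch the final observation has no later value to borrow from, so it remains NA in toy_nocb, while toy_locf fills it with 130.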

Which method is best depends on the type of data you are working with and on your assumptions about how the data change over time.

This exercise is part of the course

Case Study: Analyzing City Time Series Data in R


Exercise instructions

  • Use na.locf() to fill the missing values in gdp_xts based on the last observation carried forward. Save this new xts object as gdp_locf.
  • Use another call to na.locf() to fill missing values in gdp_xts based on the next observation carried backward. To do so, set the fromLast argument to TRUE. Save this new xts object as gdp_nocb.
  • Plot each of these objects using plot.xts(). Include the pre-written par() command to display both plots together.
  • Query each object (gdp_locf and gdp_nocb) for GDP in 1993.
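The last step relies on xts date-string subsetting. The sketch below shows the idea on a hypothetical series (toy_xts and its values are assumptions, not the course's gdp_xts): passing a year as a string selects every observation in that year, and the two fill methods can return different values for the same period.

```r
library(xts)

# Hypothetical quarterly series covering 1992-1994 (values are invented)
quarters <- seq(as.Date("1992-01-01"), by = "quarter", length.out = 12)
toy_xts <- xts(c(NA, NA, 10, NA, NA, NA, NA, 20, NA, NA, 30, NA),
               order.by = quarters)

# Fill in both directions, keeping any NA that cannot be filled
toy_locf <- na.locf(toy_xts, na.rm = FALSE)
toy_nocb <- na.locf(toy_xts, na.rm = FALSE, fromLast = TRUE)

# A year string selects all observations that fall in that year
print(toy_locf["1993"])
print(toy_nocb["1993"])
```

Here the first three quarters of 1993 are filled from the 1992 value under locf but from the 1993 Q4 value under nocb, which is why the two results are worth comparing.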

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Fill NAs in gdp_xts with the last observation carried forward
gdp_locf <- 

# Fill NAs in gdp_xts with the next observation carried backward 
gdp_nocb <- 

# Produce a plot for each of your new xts objects
par(mfrow = c(2,1))
plot.xts(___, major.format = "%Y")
plot.xts(___, major.format = "%Y")

# Query for GDP in 1993 in both gdp_locf and gdp_nocb
