Handle missing values
In Chapter 3, you used na.locf() to fill missing values with the previous non-missing value. You can use interpolation when carrying the previous value forward isn't appropriate. In this exercise, you will explore two interpolation methods: linear and spline.
Linear interpolation calculates values that lie on a line between two known data points. This is a good choice for fairly linear data, like a series with a strong trend. Spline interpolation is more appropriate for series without a strong trend, because it calculates a non-linear approximation using multiple data points.
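As a rough illustration of the difference, the sketch below applies both methods to a small made-up series. The values and the integer index are hypothetical and chosen only so the linear result is easy to check by hand. It assumes the zoo package, which provides na.approx() and na.spline().

```r
library(zoo)

# Hypothetical toy series: known points at positions 1, 3 and 5, gaps at 2 and 4
x <- zoo(c(1, NA, 3, NA, 9), order.by = 1:5)

# Linear interpolation: each NA is placed on the straight line
# between the nearest known neighbours on either side
linear <- na.approx(x)

# Spline interpolation: a smooth curve is fitted through all known points,
# so the filled values need not lie on those straight lines
smooth <- na.spline(x)

print(coredata(linear))  # 1 2 3 6 9
print(coredata(smooth))
```

The linear result depends only on the two points surrounding each gap, while the spline result can shift when points further away change.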
Use these two methods to interpolate the three missing values for the 10-year Treasury rate in the object DGS10. Then compare the results with the output of na.locf().
This exercise is part of the course “Importing and Managing Financial Data in R”.
Exercise instructions
- Complete the command to use na.approx() to fill in missing values using linear interpolation.
- Complete the command to use na.spline() to fill in missing values using spline interpolation.
- Merge locf, approx, and spline into one object named na_filled.
- Complete the command to plot na_filled.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# fill NA using last observation carried forward
locf <- na.locf(DGS10)
# fill NA using linear interpolation
approx <- ___(DGS10)
# fill NA using spline interpolation
spline <- ___(DGS10)
# merge into one object
na_filled <- ___(locf, approx, spline)
# plot combined object
___(___, col = c("black", "red", "green"))
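
One possible way to complete the template is sketched below. Because the real DGS10 object is not included here, the sketch builds a hypothetical stand-in with made-up rates and three missing values. It assumes the xts package (which also loads zoo) and uses merge() and plot() for the last two steps. Treat it as an illustration under those assumptions, not as the official solution.

```r
library(xts)  # also loads zoo, which provides na.locf(), na.approx(), na.spline()

# Hypothetical stand-in for DGS10: made-up daily rates with three NAs
dates <- as.Date("2017-01-02") + 0:9
rates <- c(2.45, 2.46, NA, 2.42, 2.38, NA, NA, 2.40, 2.36, 2.33)
DGS10 <- xts(rates, order.by = dates)

# fill NA using last observation carried forward
locf <- na.locf(DGS10)

# fill NA using linear interpolation
approx <- na.approx(DGS10)

# fill NA using spline interpolation
spline <- na.spline(DGS10)

# merge into one object; columns are named after the merged objects
na_filled <- merge(locf, approx, spline)

# plot combined object, one colour per fill method
plot(na_filled, col = c("black", "red", "green"))
```

In the plot, the three lines coincide wherever the original data is present and differ only across the gaps, which makes the effect of each fill method easy to compare.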