Exercise

# Missing values

Sometimes there are missing values in time series data, denoted `NA` in R, and it is useful to know their locations. It is also important to know how missing values are handled by various R functions. Sometimes we may want to ignore any missingness, but other times we may wish to impute or estimate the missing values.
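
As a minimal sketch of locating missing values, the snippet below uses a small made-up monthly series (the name `x` and its values are hypothetical, not part of the exercise) together with base R's `is.na()`, `which()`, and `time()`.

```r
# Hypothetical toy monthly series with two missing observations
x <- ts(c(5, NA, 7, 9, NA, 4), start = c(2000, 1), frequency = 12)

# Positions of the missing values within the series
na_idx <- which(is.na(x))

# Time stamps (in fractional years) at which values are missing
na_times <- time(x)[na_idx]

# Many summary functions return NA unless told to drop missing values
total_with_na <- sum(x)                 # NA
total_observed <- sum(x, na.rm = TRUE)  # 5 + 7 + 9 + 4

print(na_idx)
print(na_times)
```

Knowing the positions first makes it easy to decide later whether to drop the gaps or fill them in.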

Let's again consider the monthly `AirPassengers` dataset, but now the data for the year 1956 are missing. In this exercise, you'll explore the implications of this missing data and impute some new data to solve the problem.

The `mean()` function calculates the sample mean, but it returns `NA` if any value is missing. Use `mean(___, na.rm = TRUE)` to calculate the mean with all missing values removed. It is common to replace missing values with the mean of the observed values. Does this simple data imputation scheme appear adequate when applied to the `AirPassengers` dataset?
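
To make the behavior of `mean()` concrete, here is a small sketch on a made-up trending vector (the names `y`, `y_mean`, and `y_imputed` are hypothetical). It also hints at why a flat mean can be a poor fill-in when the data trend upward.

```r
# Hypothetical trending data with one missing value
y <- c(10, 20, 30, NA, 50, 60)

# Without na.rm, a single NA makes the result NA
mean_with_na <- mean(y)

# With na.rm = TRUE, the mean uses only the five observed values
y_mean <- mean(y, na.rm = TRUE)   # (10 + 20 + 30 + 50 + 60) / 5

# Mean imputation: replace every NA with the observed mean
y_imputed <- y
y_imputed[is.na(y_imputed)] <- y_mean

print(y_imputed)
```

Here the gap sits between 30 and 50, so a value near 40 would be plausible, yet mean imputation fills in 34 because it ignores where in the series the gap occurs.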

Instructions

- Use `plot()` to display a simple plot of `AirPassengers`. Note the missing data for 1956.
- Use `mean()` to calculate the sample mean of `AirPassengers` with the missing data removed (`na.rm = TRUE`).
- Run the pre-written code to impute the mean values into your missing data.
- Use another call to `plot()` to replot your newly imputed `AirPassengers` data.
- Run the pre-written code to add the complete `AirPassengers` data to your plot.
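
The steps above can be sketched roughly as follows. This is not the exercise's pre-written code: the exercise environment supplies `AirPassengers` with 1956 already missing, so the sketch recreates that situation from R's built-in dataset. The names `ap`, `ap_complete`, `idx_1956`, and `ap_mean`, as well as the line color and type, are assumptions made for illustration.

```r
# Keep the complete built-in series for the final comparison
ap_complete <- AirPassengers

# Assumption: recreate the exercise data by blanking out the 12 months of 1956
ap <- AirPassengers
idx_1956 <- (1956 - start(ap)[1]) * 12 + 1:12
ap[idx_1956] <- NA

# Step 1: plot the series; the line breaks where data are missing
plot(ap)

# Step 2: sample mean of the observed values only
ap_mean <- mean(ap, na.rm = TRUE)

# Step 3: impute the mean into the missing positions
ap[idx_1956] <- ap_mean

# Step 4: replot the imputed series
plot(ap)

# Step 5: overlay the complete data as a dotted red line for comparison
points(ap_complete, type = "l", col = 2, lty = 3)
```

In the final plot, the imputed year appears as a flat segment, while the overlaid complete data show the trend and seasonal pattern that the single mean value cannot reproduce.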