Exercise

Predicting from a trend surface

Your next task is to predict the pH at the locations where the survey data has missing values. You can do this by calling the predict() function on the fitted model from the previous exercise.

Instructions

The acidity survey data, ca_geo, and the linear model, m_trend, have been pre-defined.

  • Construct a vector that is TRUE for the rows with missing pH values.
  • Take a subset of the data wherever the pH is missing, assigning the result to ca_geo_miss.
  • By default, predict() returns predictions at all the original locations. To predict at the missing locations instead:
    • Pass the model as the first argument, as usual.
    • Pass ca_geo_miss to the newdata argument to predict missing values.
    • Assign the result to predictions.
  • Alkaline soils are those with a pH over 7. Our linear model gives us estimates and standard deviations based on a normal (Gaussian) assumption. Compute the probability of the soil pH being over 7 using pnorm() with the mean and standard deviation values from the prediction data.
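
The steps above can be sketched as follows. This is a minimal illustration, not the course solution: the real ca_geo and m_trend are not shown here, so the sketch builds hypothetical stand-ins (a plain data frame with assumed columns x, y and pH, and a first-order trend model fitted with lm()). It also assumes predict() is called with se.fit = TRUE, which is what makes predict() on an lm model return standard errors alongside the fitted values.

```r
# Hypothetical stand-in for ca_geo: coordinates plus a pH column (assumed names).
set.seed(1)
ca_geo <- data.frame(x = runif(50), y = runif(50))
ca_geo$pH <- 6 + 1.5 * ca_geo$x - 0.5 * ca_geo$y + rnorm(50, sd = 0.3)
ca_geo$pH[c(3, 17, 42)] <- NA  # blank out a few values to mimic missing data

# Hypothetical stand-in for m_trend: a linear trend surface in x and y.
# lm() drops rows with missing pH by default.
m_trend <- lm(pH ~ x + y, data = ca_geo)

# Logical vector that is TRUE for the rows with missing pH
miss <- is.na(ca_geo$pH)

# Subset of the data where pH is missing
ca_geo_miss <- ca_geo[miss, ]

# Predict at the missing locations; se.fit = TRUE also returns standard errors
predictions <- predict(m_trend, newdata = ca_geo_miss, se.fit = TRUE)

# Probability that pH exceeds 7 under the normal assumption
pAlkaline <- pnorm(7, mean = predictions$fit, sd = predictions$se.fit,
                   lower.tail = FALSE)
print(pAlkaline)
```

Setting lower.tail = FALSE gives the upper-tail probability directly and is equivalent to 1 - pnorm(7, ...). Note that se.fit is the standard error of the fitted mean; the sketch follows the exercise in using it as the standard deviation.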