Modeling grain yields
1. Modeling grain yields
So far in this chapter you've prepared your data and explored it with visualizations. The next step is to run a model and make predictions.
2. geom_smooth
Here's the plot of corn yields that you drew in the previous exercise. The important thing to notice is that the lines on the plot aren't straight, which means we need a model that handles nonlinear responses. The smooth trend lines on the plot look like a reasonable fit, so the method behind them is a sensible choice of model. geom_smooth creates these lines using generalized additive models, or GAMs (its default method when there are 1,000 or more observations), so let's see how to fit one of those.
3. Linear models vs. generalized additive models
Here's a code template for running a linear regression model. You call lm, passing a formula with the response on the left and the explanatory variables on the right, then set the data argument to the data frame containing the data. To run a GAM, you first load the mgcv package, whose name stands for "Mixed GAM Computation Vehicle". After that, the code has only two differences from the linear case. First, rather than calling lm, you call gam. Second, explanatory variables in the formula can be wrapped in s. The s function means "make the effect of this variable smooth". Very roughly, that means the response can be nonlinear, but rather than aiming for a perfect fit to the data, you want something that changes gradually, giving a smooth line rather than a jagged one.
4. Predicting GAMs
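Before turning to predictions, the two templates described above can be made concrete. This is a minimal sketch, not the course's own code: the yields data frame and its columns year and yield_kg_per_ha are made up for illustration, and the simulated trend is deliberately nonlinear.

```r
library(mgcv)  # provides gam() and s()

# Made-up data for illustration only: a curved yield trend plus noise.
set.seed(42)
yields <- data.frame(year = rep(1950:2019, each = 3))
yields$yield_kg_per_ha <- 2000 + 90 * (yields$year - 1950) +
  400 * sin((yields$year - 1950) / 8) +
  rnorm(nrow(yields), sd = 150)

# Linear regression: response on the left of the formula,
# explanatory variable on the right, data frame in the data argument.
linear_model <- lm(yield_kg_per_ha ~ year, data = yields)

# GAM: the same shape of call, but gam() replaces lm()
# and the explanatory variable is wrapped in s().
gam_model <- gam(yield_kg_per_ha ~ s(year), data = yields)

# Effective degrees of freedom of the smooth term; a value close to 1
# would mean the fitted curve is nearly a straight line.
summary(gam_model)$edf
```

Any explanatory variable left unwrapped in the gam formula is given an ordinary linear effect, so s is only needed for the variables whose effect should be smooth.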
To create a data frame of model predictions, there are three steps. First, you specify a data frame of the cases that you want to predict. That is, the data frame should contain one column of values for each explanatory variable in the model. Second, you call predict, passing the GAM model and the data frame of cases to predict, and setting the type argument to "response". Third, because the predicted responses are returned as a vector, it's convenient to store them as a column in the data frame of cases to predict. You can mutate that data frame to add the column.
5. Let's practice!
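The three prediction steps described above can be sketched as follows. This is an illustrative example under stated assumptions rather than the course's own code: the yields data, the predict_this data frame, and the pred_yield_kg_per_ha column name are all hypothetical, and mutate is taken from the dplyr package.

```r
library(mgcv)   # gam(), s(), and the predict method for GAMs
library(dplyr)  # mutate()

# Made-up data for illustration only: a curved yield trend plus noise.
set.seed(42)
yields <- data.frame(year = rep(1950:2019, each = 3))
yields$yield_kg_per_ha <- 2000 + 90 * (yields$year - 1950) +
  400 * sin((yields$year - 1950) / 8) +
  rnorm(nrow(yields), sd = 150)
gam_model <- gam(yield_kg_per_ha ~ s(year), data = yields)

# Step 1: the cases to predict, with one column per explanatory variable.
predict_this <- data.frame(year = c(1960, 1985, 2010))

# Step 2: predicted responses, returned as a vector.
pred_yield <- predict(gam_model, predict_this, type = "response")

# Step 3: store the predictions as a new column next to the cases.
# as.numeric() drops the names and dimensions that predict() attaches.
predictions <- mutate(predict_this, pred_yield_kg_per_ha = as.numeric(pred_yield))
predictions
```

For a GAM fitted with the default Gaussian family, as here, predictions on the link scale and the response scale coincide. Setting type to "response" matters once the model uses a family with a non-identity link, because predict otherwise returns values on the link scale.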
Almost there! Let's make some predictions!