
How many extensions are needed?

1. How many extensions are needed?

In this lesson, we discuss criteria for assessing the model's goodness of fit and the significance of the estimated coefficients.

2. Summarizing the model

One of our main takeaways so far is that a model fits the data well if the differences between the observed and the predicted values are small. The R-squared quantifies these differences. It can be obtained by applying the function summary() to the extended.model object. The summary() function compactly displays all the information required for evaluating the results of the fitted model, and at the bottom of the output we find the R-squared. The R-squared measures the proportion of variation in the outcome explained by the model; a value of zero would indicate that the model explains none of the variability of the outcome. In our example, the R-squared states that extended.model explains 71 percent of the variability in sales, which is really good. However, our estimated model includes a large number of predictors. Therefore, the Adjusted R-squared is a better measure, as it penalizes the model for the inclusion of irrelevant predictors. At 69 percent, the Adjusted R-squared of our model is only slightly lower than the unadjusted R-squared, which indicates that we might not have any irrelevant predictors in the model.
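For concreteness, here is a minimal sketch of what such a fit might look like. The data frame sales_data and the predictor names are hypothetical stand-ins, not the course's actual data:

    # hypothetical example: fit a linear model of SALES on several predictors
    extended.model <- lm(SALES ~ PRICE + DISPLAY + COUPON + COUPON_LAG,
                         data = sales_data)

    # summary() shows the coefficient table and, at the bottom of the output,
    # the R-squared and Adjusted R-squared
    summary(extended.model)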

3. Statistical significance

The decision whether or not a predictor is irrelevant and should be excluded from the model can be based on the t-test statistic and the corresponding P-value columns. A P-value lower than a certain predetermined critical value indicates that the predictor is likely to make a meaningful contribution to the model fit. For most applications, 0.05 is a widely used critical value. In our example, the lagged COUPON predictor has only a small effect on changes in SALES because its P-value is larger than the critical value of 0.05. Therefore, based on the P-value, one would exclude the lagged COUPON predictor.
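Continuing the hypothetical example from above, the P-values can be read directly off the coefficient table returned by summary():

    # the coefficient table holds estimates, standard errors,
    # t-statistics, and P-values (column "Pr(>|t|)")
    coef_table <- summary(extended.model)$coefficients
    coef_table[, "Pr(>|t|)"]

    # predictors with P-values above 0.05 (here, the hypothetical
    # COUPON_LAG term) are candidates for exclusion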

4. Dropping predictors

One can also drop single predictors sequentially from the model and then test for inclusion by comparing the goodness of fit of the reduced and the extended model. Dropping single predictors can be done with the function update(); we just place a minus sign in front of the predictor that should be excluded. The model fit of the reduced and the full extended model is then evaluated by the Akaike Information Criterion (AIC for short), which can be obtained with the function AIC(). Much like the Adjusted R-squared, its intent is to prevent you from including irrelevant predictors, where lower AIC values indicate a better model fit. Therefore, we evaluate the reduced model against the full extended model by its increase or decrease in AIC. In our example, the AIC value decreases, indicating that the lagged COUPON variable adds only little explanatory power.
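A sketch of this comparison, again using the hypothetical names from the example above:

    # drop the lagged COUPON predictor from the extended model
    reduced.model <- update(extended.model, . ~ . - COUPON_LAG)

    # lower AIC indicates the better-fitting model
    AIC(extended.model)
    AIC(reduced.model)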

5. Eliminating predictors

We can also do the model selection in a more data-driven way by performing stepwise selection. The package MASS offers the function stepAIC(), which chooses the best-fitting model by its AIC. The argument direction allows us to perform backward selection on the predictors in the extended model, similar to before, by starting with the full model and excluding predictors sequentially. The model with the smallest AIC, which stepAIC() returns and whose results we can inspect with summary(), is the model excluding the lagged COUPON predictor.
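A sketch of the stepwise selection, assuming the same hypothetical model object as before:

    library(MASS)

    # backward selection: start from the full extended model and
    # sequentially drop predictors whose removal lowers the AIC
    best.model <- stepAIC(extended.model, direction = "backward")
    summary(best.model)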

6. Let's practice!

Now, let’s go and crush some models.