How does loan purpose affect amount funded?
In the last exercise, we pared the purpose variable down to a more reasonable 4 categories and called it purpose_recode. As data scientists at Lending Club, we might want to design an experiment that examines how loan purpose influences the amount funded, which is the money actually issued to the applicant.
Remember that for an ANOVA test, the null hypothesis is that the mean funded amount is equal across all levels of purpose_recode. The alternative hypothesis is that at least one level of purpose_recode has a different mean. We will not know which one, however, without some post hoc analysis, so it will be helpful to know how ANOVA results get stored as an object in R.
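To make the idea of ANOVA results stored as an object more concrete, here is a minimal sketch. It assumes the built-in PlantGrowth dataset as a stand-in for the loan data, and the names group_means, plant_anova, and p_value are illustrative only, not part of the exercise.

```r
# Illustrative only: PlantGrowth ships with base R and stands in for the loan data.
data(PlantGrowth)

# Group means: the quantities the null hypothesis says are all equal
group_means <- tapply(PlantGrowth$weight, PlantGrowth$group, mean)
print(group_means)

# Fit the one-way model and store the ANOVA table as an object
plant_anova <- anova(lm(weight ~ group, data = PlantGrowth))

# The table has one row for the factor and one for the residuals
print(rownames(plant_anova))
print(colnames(plant_anova))

# Because the table is stored as an object, the overall F-test p-value
# can be pulled out by row and column name
p_value <- plant_anova["group", "Pr(>F)"]
print(p_value)
```

A small p-value here would only say that at least one group mean differs; it would not say which one, which is why post hoc analysis is still needed.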
This exercise is part of the course
Experimental Design in R
Exercise instructions
- Use lm() to look at how the purpose_recode variable affects funded_amnt. Save the model as an object called purpose_recode_model.
- Use summary() to examine purpose_recode_model. These are the results of the linear regression.
- Call anova() on purpose_recode_model. Save as an object called purpose_recode_anova. Print it to the console by typing it.
- Finally, examine the class of purpose_recode_anova.
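The same four steps can be sketched on a different dataset before trying them on the loan data. This example assumes the built-in PlantGrowth data; plant_model and plant_anova are hypothetical names chosen for illustration.

```r
# Illustrative walk-through on PlantGrowth (built into R), not the loan data.
data(PlantGrowth)

# Step 1: fit a linear model with a categorical predictor
plant_model <- lm(weight ~ group, data = PlantGrowth)

# Step 2: examine the regression results (coefficients are relative to the baseline level)
print(summary(plant_model))

# Step 3: get the ANOVA table, save it, and print it
plant_anova <- anova(plant_model)
print(plant_anova)

# Step 4: examine the class of the saved ANOVA object
anova_class <- class(plant_anova)
print(anova_class)
```

The object returned by anova() inherits from data.frame, so the usual data frame tools for indexing rows and columns apply to it.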
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Build a linear regression model, purpose_recode_model
___ <- lm(funded_amnt ~ ___, data = ___)
# Examine results of purpose_recode_model
___(purpose_recode_model)
# Get anova results and save as purpose_recode_anova
___ <- anova(___)
# Print purpose_recode_anova
___
# Examine class of purpose_recode_anova
class(___)