Model specification and estimation
You have seen the glm() command for running a logistic regression. glm() stands for generalized linear model and offers a whole family of regression models.
This coding task uses the exercise dataset defaultData, which is already available in your environment and ready for modeling.
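To make the "family of regression models" concrete, here is a minimal sketch that fits two members of that family with glm(). Because defaultData is not reproduced here, the data are simulated, and the names toy, x, y_num, and y_bin are hypothetical.

```r
# Simulated stand-in data; `toy`, `x`, `y_num`, `y_bin` are hypothetical names
set.seed(42)
n <- 200
toy <- data.frame(x = rnorm(n))
toy$y_num <- 1 + 2 * toy$x + rnorm(n)                                # continuous response
toy$y_bin <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.5 * toy$x))  # 0/1 response

# family = gaussian (the glm() default) corresponds to ordinary linear regression
linFit <- glm(y_num ~ x, family = gaussian, data = toy)
lmFit  <- lm(y_num ~ x, data = toy)
all.equal(coef(linFit), coef(lmFit))  # compares the two sets of coefficients

# family = binomial (logit link by default) corresponds to logistic regression
logitFit <- glm(y_bin ~ x, family = binomial, data = toy)
summary(logitFit)
```

The only change between the two fits is the family argument, which is why the exercise asks you to set it explicitly.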
This exercise is part of the course Machine Learning for Marketing Analytics in R.
Exercise instructions
- Use the glm() function to model the probability that a customer will default on their payment with a logistic regression. Include every explanatory variable of the dataset and specify the data to be used.
- Do not forget to specify the family argument.
- Extract the coefficients from the model, then transform them to odds ratios and round them.
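The last step, turning coefficients into odds ratios, can be sketched on a small simulated dataset. This is an illustration under stated assumptions, not the course solution: toyDefault, late, limit, and default are hypothetical names and are not columns of defaultData. Base R nesting is used so the sketch runs without a pipe package.

```r
# Simulated stand-in for the course data; all names here are hypothetical
set.seed(1)
n <- 500
toyDefault <- data.frame(
  late  = rbinom(n, size = 1, prob = 0.3),  # 1 = previous payment was late
  limit = rnorm(n)                          # standardized credit limit
)
# Simulate a 0/1 default indicator from a known logistic relationship
trueProb <- plogis(-1.5 + 1.2 * toyDefault$late - 0.4 * toyDefault$limit)
toyDefault$default <- rbinom(n, size = 1, prob = trueProb)

# Logistic regression: family = binomial uses the logit link by default
toyModel <- glm(default ~ late + limit, family = binomial, data = toyDefault)

# Coefficients are on the log-odds scale; exponentiating gives odds ratios
logOdds    <- coef(toyModel)
oddsRatios <- round(exp(logOdds), 2)
oddsRatios
```

An odds ratio above 1 means the odds of the outcome rise as the predictor increases by one unit, holding the other predictors fixed. A value below 1 means the odds fall.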
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Build logistic regression model
logitModelFull <- ___(PaymentDefault ~ limitBal + sex + education + marriage +
                        age + pay1 + pay2 + pay3 + pay4 + pay5 + pay6 + billAmt1 +
                        billAmt2 + billAmt3 + billAmt4 + billAmt5 + billAmt6 + payAmt1 +
                        payAmt2 + payAmt3 + payAmt4 + payAmt5 + payAmt6,
                      family = ___, data = ___)
# Take a look at the model
___(logitModelFull)
# Take a look at the odds ratios
coefsexp <- ___(logitModelFull) %>% ___ %>% round(2)
coefsexp