Multivariate logistic regression
Generally, you won't use only loan_int_rate to predict the probability of default. You will want to use all the data you have to make predictions.
With this in mind, try training a new model with different columns, called features, from the cr_loan_clean data. Will this model differ from the first one? To find out, you can check the .intercept_ of the logistic regression. Remember that this is the y-intercept of the function: the log-odds of the class labeled 1 (default, assuming loan_status is coded 1 for default) when every feature equals zero.
The cr_loan_clean data has been loaded in the workspace along with the previous model, clf_logistic_single.
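To make the meaning of the intercept concrete, here is a minimal sketch on a tiny made-up sample. The values, the 0/1 coding (1 = default), and the names rates, status, and clf_demo are assumptions for illustration and are not taken from cr_loan_clean. It converts .intercept_ from log-odds to a probability and checks that the result matches what the model itself predicts when the feature is zero.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Tiny synthetic sample (hypothetical values, not taken from cr_loan_clean)
rates = np.array([[5.0], [7.5], [9.0], [11.0], [13.5], [15.0], [17.0], [19.0]])
status = np.array([0, 0, 0, 1, 0, 1, 1, 1])  # assumed coding: 1 = default

clf_demo = LogisticRegression(solver="lbfgs").fit(rates, status)

# The intercept is the log-odds of class 1 when the feature equals zero
log_odds_at_zero = clf_demo.intercept_[0]

# Convert log-odds to a probability with the logistic (sigmoid) function
prob_at_zero = 1.0 / (1.0 + np.exp(-log_odds_at_zero))

# The model's own predicted probability of class 1 at feature = 0
prob_from_model = clf_demo.predict_proba(np.array([[0.0]]))[0, 1]

print(log_odds_at_zero, prob_at_zero, prob_from_model)
```

Because the intercept is tied to the point where every feature is zero, adding or swapping features will generally change it, which is why comparing intercepts is a quick way to see that two models differ.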
This exercise is part of the course Credit Risk Modeling in Python.
Exercise instructions
- Create a new X data set with loan_int_rate and person_emp_length. Store it as X_multi.
- Create a y data set with just loan_status.
- Create and .fit() a LogisticRegression() model on the new X data. Store it as clf_logistic_multi.
- Print the .intercept_ value of the model.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create X data for the model
X_multi = ____[[____,____]]
# Create a set of y data for training
y = ____[[____]]
# Create and train a new logistic regression
clf_logistic_multi = ____(solver='lbfgs').____(____, np.ravel(____))
# Print the intercept of the model
print(____.____)
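
For reference, one possible way to fill in the template is sketched below. Since cr_loan_clean is only available in the exercise workspace, the sketch builds a synthetic stand-in called cr_loan_demo (a hypothetical name) with the same three column names and made-up values, so the printed intercept will not match the one from the real data.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for cr_loan_clean: same column names, made-up values
rng = np.random.default_rng(42)
n_rows = 1000
cr_loan_demo = pd.DataFrame({
    "loan_int_rate": rng.uniform(5.0, 22.0, n_rows),
    "person_emp_length": rng.integers(0, 30, n_rows).astype(float),
})

# Hypothetical rule for generating defaults, used only to get a 0/1 target
true_log_odds = (
    -3.0
    + 0.2 * cr_loan_demo["loan_int_rate"]
    - 0.05 * cr_loan_demo["person_emp_length"]
)
default_prob = 1.0 / (1.0 + np.exp(-true_log_odds))
cr_loan_demo["loan_status"] = (rng.uniform(size=n_rows) < default_prob).astype(int)

# Create X data for the model
X_multi = cr_loan_demo[["loan_int_rate", "person_emp_length"]]
# Create a set of y data for training
y = cr_loan_demo[["loan_status"]]
# Create and train a new logistic regression
clf_logistic_multi = LogisticRegression(solver="lbfgs").fit(X_multi, np.ravel(y))
# Print the intercept of the model
print(clf_logistic_multi.intercept_)
```

In the workspace itself, replacing cr_loan_demo with cr_loan_clean should give the intended solution. np.ravel(y) flattens the single-column DataFrame into the one-dimensional array that .fit() expects for the target.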