Logistic regression basics
You've now cleaned up the data and created the new data set cr_loan_clean.
Think back to the final scatter plot from chapter 1, which showed more defaults with high loan_int_rate. Interest rates are easy to understand, but how useful are they for predicting the probability of default?
Since you haven't tried predicting the probability of default yet, test out creating and training a logistic regression model with just loan_int_rate. Also check the model's internal parameters, which are like settings, to see the structure of the model with this one column.
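To make the idea of predicting a probability from one column concrete, here is a minimal sketch of how a single-feature logistic model turns an interest rate into a default probability. The function name default_probability and the intercept and coefficient values are hypothetical and chosen only for illustration; they are not fitted values from cr_loan_clean.

```python
import math

def default_probability(loan_int_rate, intercept, coef):
    """Apply the logistic (sigmoid) function to a linear score."""
    score = intercept + coef * loan_int_rate
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical parameters, NOT values learned from the course data
intercept = -4.0
coef = 0.25

# With a positive coefficient, a higher rate gives a higher probability
for rate in (6.0, 12.0, 18.0):
    print(rate, round(default_probability(rate, intercept, coef), 3))
```

A fitted model stores its own versions of these two numbers, which is why inspecting the intercept is a quick way to see what the model has learned.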
The data cr_loan_clean has already been loaded in the workspace.
This exercise is part of the course
Credit Risk Modeling in Python
Exercise instructions
- Create the X and y sets using the loan_int_rate and loan_status columns.
- Create and fit a logistic regression model to the training data and call it clf_logistic_single.
- Print the parameters of the model with .get_params().
- Check the intercept of the model with the .intercept_ attribute.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create the X and y data sets
X = ____[[____]]
y = ____[[____]]
# Create and fit a logistic regression model
____ = ____()
clf_logistic_single.____(X, np.ravel(____))
# Print the parameters of the model
print(____.____())
# Print the intercept of the model
print(____.____)
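For reference, the same sequence of steps can be sketched outside the course workspace. This assumes scikit-learn's LogisticRegression, which the .get_params() and .intercept_ names suggest but the exercise text does not state. Because cr_loan_clean is not available here, a small synthetic DataFrame (the hypothetical cr_loan_demo, with made-up values) stands in for it, so the printed numbers will not match the course data.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for cr_loan_clean; all values are made up
rng = np.random.default_rng(0)
rates = rng.uniform(5.0, 23.0, size=500)
# Assumed relationship for the demo: higher rates default more often
p_default = 1.0 / (1.0 + np.exp(-(-4.0 + 0.25 * rates)))
cr_loan_demo = pd.DataFrame({
    "loan_int_rate": rates,
    "loan_status": (rng.uniform(size=500) < p_default).astype(int),
})

# Double brackets keep X and y as DataFrames (two-dimensional)
X = cr_loan_demo[["loan_int_rate"]]
y = cr_loan_demo[["loan_status"]]

# np.ravel flattens y to the 1-D shape that fit() expects for labels
clf_logistic_single = LogisticRegression()
clf_logistic_single.fit(X, np.ravel(y))

# Settings the model was created with (defaults here)
print(clf_logistic_single.get_params())

# Learned intercept, available only after fitting
print(clf_logistic_single.intercept_)
```

Note that get_params() reports the configuration chosen before training, while intercept_ (like coef_) is a value learned from the data during fit().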