Creating training and test sets
You've just trained LogisticRegression() models on different columns.
You know that the data should be separated into training and test sets. train_test_split() creates both at the same time. The training set is used to fit the model, while the test set is used to evaluate it. Without evaluating the model, you have no way to tell how well it will perform on new loan data.
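As a minimal sketch of the split itself, the snippet below runs train_test_split() on a small toy array rather than the course data. The array contents are made up for illustration; only the test_size and random_state values mirror the exercise.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-in data: 10 rows, 2 feature columns (not the course data)
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1, 0, 0, 1, 0, 1, 0, 0, 1])

# test_size=.4 holds out 40% of the rows for evaluation;
# random_state makes the shuffle repeatable between runs
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=.4, random_state=123
)

print(X_train.shape, X_test.shape)  # (6, 2) (4, 2)
```

Rows are shuffled before splitting, so the same random_state is needed to reproduce the same partition.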
In addition to .intercept_, LogisticRegression() models also have the .coef_ attribute. It holds one weight per training column, indicating how strongly and in which direction that column moves the predicted probability of default.
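To make the role of the intercept and coefficients concrete, here is a small sketch of how a fitted logistic regression turns them into a probability. The helper default_probability and all numeric values are hypothetical, chosen only for illustration; they are not taken from the course model.

```python
import math

def default_probability(intercept, coefs, row):
    """Apply the logistic function to intercept + sum(coef * value)."""
    log_odds = intercept + sum(c * v for c, v in zip(coefs, row))
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical values, for illustration only
intercept = -4.0
coefs = [0.3, -0.05]   # e.g. weights for interest rate and employment length
row = [12.0, 5.0]      # one loan: 12% interest rate, 5 years employed

p = default_probability(intercept, coefs, row)
print(round(p, 3))
```

A positive weight raises the probability as the column's value grows, and a negative weight lowers it. Because each weight multiplies the raw column value, magnitudes are only comparable between columns that are on similar scales.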
The data set cr_loan_clean is already loaded in the workspace.
This exercise is part of the course
Credit Risk Modeling in Python
Exercise instructions
- Create the data set X using interest rate, employment length, and income. Create the y set using loan status.
- Use train_test_split() to create the training and test sets from X and y.
- Create and train a LogisticRegression() model and store it as clf_logistic.
- Print the coefficients of the model using .coef_.
Interactive exercise
Complete the sample code to successfully finish this exercise.
# Create the X and y data sets
X = ____[[____,____,____]]
y = ____[[____]]
# Use train_test_split to create the training and test sets
X_train, X_test, y_train, y_test = ____(____, ____, test_size=.4, random_state=123)
# Create and fit the logistic regression model
____ = ____(solver='lbfgs').____(____, np.ravel(____))
# Print the model's coefficients
print(____.coef_)
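For reference, the same workflow can be sketched end to end on synthetic data. This is not the course solution: the DataFrame loans stands in for cr_loan_clean, the column names (loan_int_rate, person_emp_length, person_income, loan_status) are assumptions, and the rule generating loan_status is invented so the example runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for cr_loan_clean; column names are assumptions
rng = np.random.default_rng(0)
n = 200
loans = pd.DataFrame({
    "loan_int_rate": rng.uniform(5, 20, n),
    "person_emp_length": rng.integers(0, 15, n),
    "person_income": rng.uniform(20, 120, n),  # in thousands
})
# Invented rule for illustration: higher rates default more often
noise = rng.normal(0, 3, n)
loans["loan_status"] = (loans["loan_int_rate"] + noise > 14).astype(int)

# Create the X and y data sets
X = loans[["loan_int_rate", "person_emp_length", "person_income"]]
y = loans[["loan_status"]]

# Hold out 40% of the rows for evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=.4, random_state=123
)

# np.ravel flattens the one-column y frame into the 1-D array fit() expects
clf_logistic = LogisticRegression(solver="lbfgs").fit(X_train, np.ravel(y_train))

# One row of weights, one weight per column of X
print(clf_logistic.coef_)

# Accuracy on the held-out test set
print(clf_logistic.score(X_test, np.ravel(y_test)))
```

The printed numbers depend on the synthetic data and carry no meaning for real loans; the point is the order of the steps and the shape of .coef_.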