Creating training and test sets

You've just trained LogisticRegression() models on different columns.

You know that the data should be separated into training and test sets. train_test_split() creates both at the same time. The training set is used to fit the model, while the test set is held out to evaluate its predictions. Without evaluating the model on data it has not seen, you have no way to tell how well it will perform on new loan data.
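As a minimal sketch of what the split produces, the snippet below runs train_test_split() on a small synthetic array rather than the course data. The array contents are made up for illustration; only the function and its test_size and random_state parameters come from the exercise.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (not cr_loan_clean): 10 rows, 2 feature columns
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1, 0, 1, 0, 1, 0, 1, 0, 1])

# Hold out 40% of the rows for evaluation; random_state makes the split repeatable
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=123
)

# 6 rows go to training and 4 to testing
print(X_train.shape, X_test.shape)  # (6, 2) (4, 2)
```

The model should only ever be fit on X_train and y_train; X_test and y_test stay untouched until evaluation.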

In addition to .intercept_, LogisticRegression() models also have the .coef_ attribute, which holds one coefficient per training column. Each coefficient gives the direction and size of that column's effect on the log-odds of default, which is a rough indication of how important the column is for the prediction.
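The following sketch shows what these two attributes look like after fitting. It uses randomly generated data in which only the first column influences the label, so the data, the variable name clf, and the seed are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data: 200 rows, 3 columns; only column 0 drives the label
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(int)

clf = LogisticRegression(solver="lbfgs").fit(X, y)

# For a binary problem, coef_ has one row and one entry per column
print(clf.coef_.shape)       # (1, 3)
print(clf.intercept_.shape)  # (1,)
print(clf.coef_)
```

Here the first coefficient should come out much larger in magnitude than the other two. Note that coefficient size depends on the scale of each column, so comparing raw coefficients across columns measured in very different units (such as income and interest rate) can be misleading.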

The data set cr_loan_clean is already loaded in the workspace.

This exercise is part of the course

Credit Risk Modeling in Python

Exercise instructions

  • Create the data set X using interest rate, employment length, and income. Create the y set using loan status.
  • Use train_test_split() to create the training and test sets from X and y.
  • Create and train a LogisticRegression() model and store it as clf_logistic.
  • Print the coefficients of the model using .coef_.
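One detail of the steps above is worth a small illustration: selecting columns with double brackets returns a two-dimensional DataFrame, while scikit-learn's fit() expects a one-dimensional target, which is why the sample code wraps the target in np.ravel(). The tiny DataFrame and its column names below are hypothetical stand-ins, not the actual contents of cr_loan_clean.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for cr_loan_clean; column names are assumptions
df = pd.DataFrame({
    "loan_int_rate": [11.1, 12.9, 15.2, 7.9],
    "loan_status": [0, 1, 1, 0],
})

# Double brackets keep the result as a DataFrame with one column
y = df[["loan_status"]]

# np.ravel flattens it into the 1-D array that fit() expects
y_flat = np.ravel(y)

print(y.shape, y_flat.shape)  # (4, 1) (4,)
```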

Interactive exercise

Complete the sample code to finish this exercise successfully.

# Create the X and y data sets
X = ____[[____,____,____]]
y = ____[[____]]

# Use train_test_split to create the training and test sets
X_train, X_test, y_train, y_test = ____(____, ____, test_size=.4, random_state=123)

# Create and fit the logistic regression model
____ = ____(solver='lbfgs').____(____, np.ravel(____))

# Print the model's coefficients
print(____.coef_)