Predicting probability of default

With the data processing complete, it's time to start predicting the probability of default. You will train a LogisticRegression() model on the data and examine the probabilities of default it predicts.

To better understand what the model returns from predict_proba(), look at a few example records alongside their predicted probability of default. How do the first five predictions compare with the actual values of loan_status?

The data set cr_loan_prep, along with X_train, X_test, y_train, and y_test, has already been loaded in the workspace.

Train a logistic regression model on the training data and store it as clf_logistic.
Use predict_proba() on the test data to create the predictions and store them in preds.
Create two data frames, preds_df and true_df, to store the first five predictions and true loan_status values.
Print true_df and preds_df side by side as one data frame using pd.concat().
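
The steps above can be sketched roughly as follows. Because the workspace objects are not available outside the exercise, this sketch builds a small synthetic stand-in for the loan data; the feature columns, the coefficients used to simulate defaults, the split settings, the prob_default column name, and the variable name result are all assumptions made for illustration, not the course's actual data or solution.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the prepared loan data (hypothetical columns and values).
rng = np.random.default_rng(0)
n = 500
rate = rng.uniform(5, 20, n)          # interest rate, in percent
pct_income = rng.uniform(0.05, 0.6, n)  # loan amount as a share of income

# Simulate defaults: higher rates and higher loan-to-income ratios raise the risk.
score = 0.25 * (rate - 12) + 6 * (pct_income - 0.3)
default = (rng.uniform(size=n) < 1 / (1 + np.exp(-score))).astype(int)

X = pd.DataFrame({"loan_int_rate": rate, "loan_percent_income": pct_income})
y = pd.Series(default, name="loan_status")

# In the exercise these four objects are already loaded in the workspace.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=123
)

# Train the logistic regression on the training data.
clf_logistic = LogisticRegression(solver="lbfgs")
clf_logistic.fit(X_train, np.ravel(y_train))

# predict_proba() returns one row per record and one column per class.
preds = clf_logistic.predict_proba(X_test)

# Keep the first five probabilities of default (class 1) and the true labels.
preds_df = pd.DataFrame(preds[:, 1][0:5], columns=["prob_default"])
true_df = y_test.head().reset_index(drop=True).to_frame()

# Put the true values and the predictions side by side.
result = pd.concat([true_df, preds_df], axis=1)
print(result)
```

The columns of predict_proba() follow the order of clf_logistic.classes_, so with labels 0 and 1 the second column holds the probability of default. Resetting the index on the true values keeps the two data frames aligned row by row when they are concatenated.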
