Resampling techniques

In the last exercise, you saw how class imbalance can affect the results of your confusion matrix. In this exercise, you'll practice resampling techniques and explore how different resampling approaches change the results on a dataset with class imbalance, like loan_data. Using sklearn's resample() function, growing the minority class to match the number of rows in the majority class is called upsampling, while shrinking the majority class to match the number of rows in the minority class is called downsampling.

You will create both an upsampled and a downsampled version of the loan_data dataset, fit a logistic regression to each of them, and then evaluate performance. The training data and its labels have been subset into deny, which contains only the minority class, and approve, which contains only the majority class.

The test features from a train/test split have been saved to the workspace as X_test so you can make predictions in the exercises.

1
- Create an upsampled minority class the length of the majority class and concatenate (done for you).
- Create a downsampled majority class the length of the minority class and concatenate (done for you).
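
The two resampling steps above can be sketched roughly as follows. This is a minimal illustration rather than the exercise's own solution: the toy DataFrame loan_toy, its income and loan_status columns, and the random_state value are all hypothetical stand-ins for loan_data.

```python
import pandas as pd
from sklearn.utils import resample

# Hypothetical stand-in for loan_data: 8 approvals (majority), 2 denials (minority).
loan_toy = pd.DataFrame({
    "income": [52, 61, 48, 75, 66, 58, 70, 64, 23, 19],
    "loan_status": ["approve"] * 8 + ["deny"] * 2,
})
approve = loan_toy[loan_toy["loan_status"] == "approve"]
deny = loan_toy[loan_toy["loan_status"] == "deny"]

# Upsample: draw minority rows WITH replacement until they match the majority count.
deny_upsampled = resample(deny, replace=True, n_samples=len(approve), random_state=123)
upsampled = pd.concat([approve, deny_upsampled])

# Downsample: draw majority rows WITHOUT replacement down to the minority count.
approve_downsampled = resample(approve, replace=False, n_samples=len(deny), random_state=123)
downsampled = pd.concat([approve_downsampled, deny])

# Both classes should now be balanced in each resampled frame.
print(upsampled["loan_status"].value_counts())
print(downsampled["loan_status"].value_counts())
```

Upsampling keeps every original row but repeats minority rows, while downsampling discards majority rows, so the two balanced datasets differ considerably in size.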

2
- Create an upsampled feature matrix and target array.
- Instantiate a logistic regression model object, fit, and predict with X_test.
- Print the evaluation metrics.

3
- Create a downsampled feature matrix and target array.
- Instantiate a logistic regression model object, fit, and predict with X_test.
- Print the evaluation metrics.
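
As a rough sketch of steps 2 and 3, the snippet below fits a logistic regression on an upsampled and a downsampled training set and evaluates both against the same X_test. The data is synthetic (make_classification), the helper fit_and_report and the column names are hypothetical, and using a confusion matrix plus a classification report as the "evaluation metrics" is an assumption, since the exercise does not say which metrics it prints.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

# Synthetic imbalanced data (assumption: stands in for loan_data; class 1 is the minority).
X, y = make_classification(n_samples=1000, n_features=5, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

# Put the training features and labels in one frame so rows are resampled together.
train = pd.DataFrame(X_train, columns=[f"f{i}" for i in range(X_train.shape[1])])
train["target"] = y_train
majority = train[train["target"] == 0]
minority = train[train["target"] == 1]

upsampled = pd.concat(
    [majority, resample(minority, replace=True, n_samples=len(majority), random_state=0)]
)
downsampled = pd.concat(
    [resample(majority, replace=False, n_samples=len(minority), random_state=0), minority]
)

def fit_and_report(resampled, name):
    """Fit on a resampled training frame, predict on X_test, print metrics."""
    X_res = resampled.drop(columns="target").to_numpy()  # feature matrix
    y_res = resampled["target"].to_numpy()               # target array
    model = LogisticRegression(max_iter=1000)
    model.fit(X_res, y_res)
    preds = model.predict(X_test)
    print(name)
    print(confusion_matrix(y_test, preds))
    print(classification_report(y_test, preds))
    return preds

preds_up = fit_and_report(upsampled, "Upsampled")
preds_down = fit_and_report(downsampled, "Downsampled")
```

Only the training data is resampled here; X_test keeps its original class balance, so the metrics reflect how each model would behave on imbalanced data.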
