Exercise

Model adjustments

A simple way to adjust the random forest model to deal with highly imbalanced fraud data is to use the class_weight option when defining your sklearn model. However, as you will see, it is a fairly blunt instrument and might not work for your particular case.
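
To make the effect of class weighting concrete, here is a minimal sketch of the "balanced" heuristic documented by scikit-learn, where each class gets the weight n_samples / (n_classes * class_count). The helper name balanced_class_weights and the 990/10 label split are hypothetical and chosen only for illustration; the "balanced_subsample" mode applies the same formula, but recomputes it on the bootstrap sample drawn for each tree.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: weight} using n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical labels: 990 legitimate (0) and 10 fraudulent (1) transactions
y_example = [0] * 990 + [1] * 10

weights = balanced_class_weights(y_example)
print(weights)
```

With this split, a fraud case weighs 99 times as much as a legitimate one, which shows why the option is blunt: the weights follow mechanically from the class counts and take no account of how costly each kind of error actually is.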

In this exercise you'll explore the class_weight="balanced_subsample" mode of the Random Forest model from the earlier exercise. You have already split your data into a training and test set, i.e. X_train, X_test, y_train and y_test are available. The metrics functions have already been imported.

Instructions
  • Set the class_weight argument of your classifier to balanced_subsample.
  • Fit your model to your training set.
  • Obtain predictions and probabilities from X_test.
  • Obtain the roc_auc_score, the classification report and confusion matrix.
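
The steps above could be sketched as follows. This is one possible approach, not the reference solution: because the exercise data is not available here, the sketch builds a synthetic imbalanced dataset with make_classification as a stand-in, and the sample size, class ratio, number of trees and random_state values are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the fraud data (assumption): roughly 3% positives
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.97, 0.03], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Reweight classes within each tree's bootstrap sample
model = RandomForestClassifier(
    n_estimators=50, class_weight="balanced_subsample", random_state=5
)

# Fit the model to the training set
model.fit(X_train, y_train)

# Hard class predictions and per-class probabilities for the test set
predicted = model.predict(X_test)
probs = model.predict_proba(X_test)

# ROC AUC is computed from the probability of the positive class (column 1)
auc = roc_auc_score(y_test, probs[:, 1])
report = classification_report(y_test, predicted)
conf_mat = confusion_matrix(y_test, predicted)

print(auc)
print(report)
print(conf_mat)
```

Note that roc_auc_score takes the positive-class probabilities, while the classification report and the confusion matrix take the hard predictions.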