Exercise

Logistic Regression

In this last lesson you'll combine three algorithms into one model with the VotingClassifier. This allows you to benefit from the different strengths of all three models and hopefully improve overall performance, detecting more fraud. The first model, the Logistic Regression, has a slightly higher recall score than the optimal Random Forest model, but produces many more false positives. You'll also add a Decision Tree with balanced class weights to the ensemble. The data is already split into a training and test set, i.e. X_train, y_train, X_test and y_test are available.
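As a preview of where this is heading, here is a minimal sketch of how three estimators might be combined with scikit-learn's VotingClassifier. The component models and their hyperparameters below are illustrative assumptions, not the course's exact setup:

    from sklearn.linear_model import LogisticRegression
    from sklearn.ensemble import RandomForestClassifier, VotingClassifier
    from sklearn.tree import DecisionTreeClassifier

    # Illustrative component models; the hyperparameters are assumptions
    clf1 = LogisticRegression(class_weight={0: 1, 1: 15})
    clf2 = RandomForestClassifier(class_weight='balanced')
    clf3 = DecisionTreeClassifier(class_weight='balanced')

    # Hard voting takes the majority class label across the three models
    ensemble_model = VotingClassifier(
        estimators=[('lr', clf1), ('rf', clf2), ('dt', clf3)], voting='hard')
    ensemble_model.fit(X_train, y_train)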

To understand how the VotingClassifier can potentially improve on your original model, first check the standalone results of the Logistic Regression model.

Instructions

100 XP
  • Define a LogisticRegression model with class weights of 1:15 for the non-fraud versus fraud cases.
  • Fit the model to the training set and obtain the model predictions on the test set.
  • Print the classification report and confusion matrix; a sketch of these steps follows below.
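
A minimal sketch of these steps, assuming X_train, y_train, X_test and y_test are already in scope as described above:

    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import classification_report, confusion_matrix

    # Weight fraud cases (class 1) fifteen times heavier than non-fraud (class 0)
    model = LogisticRegression(class_weight={0: 1, 1: 15})

    # Fit the model to the training set and get predictions on the test set
    model.fit(X_train, y_train)
    predicted = model.predict(X_test)

    # Print performance results
    print('Classification report:\n', classification_report(y_test, predicted))
    print('Confusion matrix:\n', confusion_matrix(y_test, predicted))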