Exercise

Training a random forest with original features

In this exercise, we will train a random forest model on the original features of the credit card dataset. The goal is to detect new fraud instances in the future by learning the patterns of fraud instances in the balanced training set. Remember that a random forest can be trained with the following piece of code:

randomForest(x = features, y = label, ntree = 100)

The only pre-processing applied to the original features was scaling the Time and Amount variables. The balanced training dataset is available in the environment as creditcard_train, and the randomForest package has been loaded.
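For reference, scaling two columns of a data frame might look like the sketch below. The exercise does not say which scaling method was used; standardization with base R's scale() (zero mean, unit standard deviation) is an assumption here, and the raw data frame and its values are made up for illustration.

```r
# Hypothetical toy data; the real dataset has many more rows and columns.
raw <- data.frame(
  Time   = c(0, 3600, 7200, 86400),
  Amount = c(1.5, 20, 149.62, 2500)
)

# Assumption: standardize each column to mean 0 and standard deviation 1.
scaled <- raw
scaled[, c("Time", "Amount")] <- scale(raw[, c("Time", "Amount")])

# After standardization, column means are (numerically) 0 and sds are 1.
print(colMeans(scaled))
print(apply(scaled, 2, sd))
```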

Instructions

  • Set the random seed to 1234.
  • Separate the features and label of creditcard_train into train_x and train_y.
  • Train a random forest with 100 trees using the randomForest() function.
  • Plot the error evolution and the importance of the variables.
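
One possible way to carry out these steps is sketched below. It is not the official solution. Because creditcard_train only exists in the exercise environment, the sketch builds a small synthetic stand-in so that it runs on its own. The label column name Class, the feature columns V1 and V2, and the object name rf_model are assumptions, not taken from the exercise text.

```r
library(randomForest)

# Hypothetical stand-in for creditcard_train (assumption: the label column
# is a factor named Class; the real data has more features and rows).
set.seed(42)
n <- 200
creditcard_train <- data.frame(
  Time   = rnorm(n),
  Amount = rnorm(n),
  V1     = rnorm(n),
  V2     = rnorm(n)
)
creditcard_train$Class <- factor(
  ifelse(creditcard_train$V1 + rnorm(n, sd = 0.5) > 0, 1, 0)
)

# Step 1: set the seed so the forest is reproducible.
set.seed(1234)

# Step 2: separate the features from the label.
train_x <- creditcard_train[, names(creditcard_train) != "Class"]
train_y <- creditcard_train$Class

# Step 3: train a random forest with 100 trees.
rf_model <- randomForest(x = train_x, y = train_y, ntree = 100)

# Step 4: plot the error evolution (OOB and per-class error rates by number
# of trees), then the variable importance.
plot(rf_model, main = "Error evolution vs. number of trees")
legend("topright", legend = colnames(rf_model$err.rate),
       col = 1:3, lty = 1:3, cex = 0.8)
varImpPlot(rf_model, main = "Variable importance")
```

In the exercise environment, the block that creates the synthetic data would be skipped, since creditcard_train is already loaded. Passing the label as a factor matters: randomForest() fits a classification forest for a factor y and a regression forest for a numeric y.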