Random Forest Classifier - part 1

Let's now build a first random forest classifier for fraud detection. Hopefully it can beat the baseline accuracy you've just calculated, which was roughly 96%. This model will serve as the "baseline" model that you'll try to improve in the upcoming exercises. Start by splitting the data into training and test sets and defining the random forest model. The available data are the features X and the labels y.
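
For intuition about that 96% figure, here is a minimal sketch. It assumes the baseline is the accuracy of always predicting the majority (non-fraud) class. The labels in y_example are made up for illustration and are not the course data.

```python
from collections import Counter

# Hypothetical labels (assumption): 96 non-fraud cases (0) and 4 fraud cases (1)
y_example = [0] * 96 + [1] * 4

# Find the most frequent class and how often it occurs
majority_class, majority_count = Counter(y_example).most_common(1)[0]

# Always predicting the majority class is correct for exactly those cases
baseline_accuracy = majority_count / len(y_example)
print(f"Baseline accuracy: {baseline_accuracy:.0%}")
```

Under that assumption, a model has to score above this figure before its accuracy says anything about its ability to catch fraud.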

This exercise is part of the course

Fraud Detection in Python

Exercise instructions

  • Import the random forest classifier from sklearn.
  • Split your features X and labels y into a training and test set. Set aside a test set of 30%.
  • Assign the random forest classifier to model and keep random_state at 5. Fixing the random state makes the results reproducible, so you can compare them across different models.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Import the random forest model from sklearn
from sklearn.ensemble import ____

# Split your data into training and test set
X_train, X_test, y_train, y_test = ____(____, ____, test_size=____, random_state=0)

# Define the model as the random forest
model = ____(random_state=5)
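
The following sketch shows one way the template might be completed. Because the course data is not available here, X and y are generated with make_classification as a synthetic stand-in (an assumption); the class weights are chosen only to give a rough 96/4 imbalance.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the course data (assumption): roughly 96% non-fraud
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.96], random_state=0
)

# Hold out 30% of the rows as a test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Define the model; it is not fitted yet
model = RandomForestClassifier(random_state=5)

print(X_train.shape, X_test.shape)
```

With 1,000 rows and test_size=0.3, the split leaves 700 rows for training and 300 for testing.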