Combining ROS & RUS
You can combine random over-sampling (ROS) and random under-sampling (RUS) to balance the class distribution. In this exercise, you will re-balance the dataset so that the new dataset contains 10,000 transactions, of which 30% are fraudulent.
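As a quick check of what this target implies, the arithmetic can be sketched in base R. The names `n_fraud` and `n_legit` are illustrative and not part of the exercise.

```r
# Target size and fraud share, as stated in the exercise text
n_new <- 10000
fraud_fraction <- 0.3

# Expected class counts in the re-balanced dataset
n_fraud <- round(n_new * fraud_fraction)  # fraudulent transactions
n_legit <- n_new - n_fraud                # legitimate transactions

c(fraud = n_fraud, legit = n_legit)
```

So roughly 3,000 fraud cases and 7,000 legitimate cases are expected. Whether each class has to be over- or under-sampled to get there depends on how many cases of each class the original dataset holds.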
Remember, you can always load ROSE in the console and enter ?ovun.sample to check which arguments the function takes.
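For orientation, here is a minimal sketch of how `ovun.sample()` might be called with `method = "both"`. It uses a small made-up data frame (`toy`, with columns `Amount` and `Class`) rather than the course dataset, and a smaller `N` than the exercise asks for, so treat every name and value in it as an assumption.

```r
library(ROSE)

# Hypothetical toy data: 1,000 rows, 20 of them labelled as fraud (Class = 1)
set.seed(1)
toy <- data.frame(
  Amount = round(runif(1000, 1, 500), 2),
  Class  = factor(c(rep(0, 980), rep(1, 20)))
)

# method = "both" combines over- and under-sampling;
# N is the size of the returned dataset, p the target share of the rare class
result <- ovun.sample(Class ~ ., data = toy, method = "both",
                      N = 200, p = 0.3, seed = 2018)

# The re-sampled data frame is stored in the `data` element
prop.table(table(result$data$Class))
```

The share of the rare class in the output should land near `p`, though not necessarily on it exactly.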
This exercise is part of the course Fraud Detection in R.
Exercise instructions
- Load the ROSE package.
- Set n_new equal to 10,000 and fraud_fraction to 30%.
- Use both over- and under-sampling.
- Check the class balance of the re-balanced dataset.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Load ROSE
___
# Specify the desired number of cases in the balanced dataset and the fraction of fraud cases
n_new <- ___
fraud_fraction <- ___
# Combine ROS & RUS!
sampling_result <- ___(___ = ___, ___ = ___,
___ = ___, ___ = ___, p = ___, seed = 2018)
# Verify the class balance of the re-balanced dataset
sampled_credit <- ___
prop.table(___(___))
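To make the idea behind combining ROS and RUS concrete, the following base-R sketch re-balances a made-up data frame by hand. It is a simplified illustration with fixed class counts, not ROSE's implementation, which may determine the class counts differently. The data frame `toy` and all other names here are hypothetical.

```r
# Hypothetical toy data: 980 legitimate rows (Class = 0) and 20 fraud rows (Class = 1)
set.seed(2018)
toy <- data.frame(Amount = runif(1000, 1, 500),
                  Class  = factor(c(rep(0, 980), rep(1, 20))))

# Target size and fraud share (smaller than in the exercise, for illustration)
n_new <- 500
fraud_fraction <- 0.3
n_fraud <- round(n_new * fraud_fraction)
n_legit <- n_new - n_fraud

legit_rows <- which(toy$Class == 0)
fraud_rows <- which(toy$Class == 1)

# RUS: keep a random subset of the majority class (without replacement)
keep_legit <- sample(legit_rows, n_legit, replace = FALSE)
# ROS: draw minority rows with replacement to reach the target count
keep_fraud <- sample(fraud_rows, n_fraud, replace = TRUE)

manual_sample <- toy[c(keep_legit, keep_fraud), ]

# Check the class balance of the re-balanced data
prop.table(table(manual_sample$Class))
```

Because the fraud rows are drawn with replacement, the re-balanced data contains duplicated fraud cases, while part of the legitimate cases is discarded.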