Imbalanced class distribution
The dataset transfers contains credit transfers, some of which were recorded as fraud. The column fraud_flag indicates whether a transaction is fraudulent (fraud_flag = 1) or not (fraud_flag = 0).
Since fraud is typically very rare, it is important to take the large imbalance between the number of fraudulent cases and regular cases into account. Let's check the fraction of legitimate and fraudulent cases and visualize the imbalance with a pie chart.
The dataset transfers is loaded in your workspace. The visualization part has been defined for you, as data visualization in general is outside the scope of this course.
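As a rough illustration of the check described above, the following sketch counts the two classes, converts the counts to fractions, and draws a pie chart using base R only. Because transfers itself is only available in the course workspace, the sketch builds a small stand-in data frame called transfers_demo; that name, the amount column, and the 980/20 split are made-up assumptions for the example, and only the column name fraud_flag comes from the exercise text.

```r
# Hypothetical stand-in for `transfers`: 1000 rows, 20 of them flagged as fraud.
# The data are invented for illustration; only `fraud_flag` matches the exercise.
transfers_demo <- data.frame(
  amount     = rep(c(10, 250, 40, 99), times = 250),
  fraud_flag = c(rep(0, 980), rep(1, 20))
)

# Inspect the first 6 rows (head() returns 6 rows by default)
print(head(transfers_demo))

# Absolute number of cases per class
class_counts <- table(transfers_demo$fraud_flag)
print(class_counts)

# Relative frequency of each class
class_fractions <- prop.table(class_counts)
print(class_fractions)

# Pie chart of the class imbalance, with percentages in the labels
pie_labels <- paste0(
  c("no fraud", "fraud"),
  " (", round(100 * as.numeric(class_fractions), 1), "%)"
)
pie(class_counts,
    labels = pie_labels,
    col    = c("blue", "red"),
    main   = "Class distribution (toy data)")
```

With the real data, the same calls would be applied to transfers$fraud_flag instead of the toy column. table() followed by prop.table() is one common way to get class fractions in base R; it is not necessarily the approach the course solution uses.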
This exercise is part of the course Fraud Detection in R.
Have a go at this exercise by completing this sample code.
# Print the first 6 rows of the dataset
___(transfers)