Visualizing patterns in the data
The first step before you start modeling is to explore your data. Let's start by examining your dataset and visualizing how fraudulent and regular samples differ. This time, you're going to build the visualization yourself!
The dataset transfers contains credit transfers, and some of them were recorded as fraud. The column fraud_flag indicates whether the transaction is fraudulent (fraud_flag = 1) or not (fraud_flag = 0). This dataset and the ggplot2 package are loaded in your workspace.
This exercise is part of the course
Fraud Detection in R
Exercise instructions
- Plot the column amount as the independent variable on the x axis, and the column orig_balance_before, which is the balance on the originator's account before booking the transfer, as the dependent variable on the y axis.
- Color and shape the data based on the value in the fraud_flag column.
Hands-on interactive exercise
Try this exercise by completing the sample code.
# Make a scatter plot
ggplot(transfers, aes(x = ___, y = ___)) +
  geom_point(aes(color = ___, shape = ___)) +
  scale_color_manual(values = c('dodgerblue', 'red'))
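To see how the pieces of the template fit together, here is a minimal sketch on a small made-up data frame. The name transfers_demo and all of its values are hypothetical stand-ins for the course's transfers dataset, which is not reproduced here. The sketch also assumes fraud_flag might be stored as a number, so it wraps the column in factor(): the shape aesthetic and scale_color_manual() both need a discrete variable. If fraud_flag is already a factor in your workspace, the wrapper is unnecessary.

```r
library(ggplot2)

# Hypothetical toy data standing in for `transfers`; the values are made up
transfers_demo <- data.frame(
  amount              = c(120, 4500, 80, 9800, 300, 7600),
  orig_balance_before = c(2500, 4600, 900, 9900, 15000, 7700),
  fraud_flag          = c(0, 1, 0, 1, 0, 1)
)

# Scatter plot: transfer amount on x, originator's balance before booking on y.
# factor() turns the 0/1 flag into a discrete variable so it can drive
# both the point color and the point shape.
p <- ggplot(transfers_demo, aes(x = amount, y = orig_balance_before)) +
  geom_point(aes(color = factor(fraud_flag), shape = factor(fraud_flag))) +
  scale_color_manual(values = c("dodgerblue", "red")) +
  labs(color = "fraud_flag", shape = "fraud_flag")  # same title merges the legends

print(p)
```

Because scale_color_manual() assigns colors in the order of the factor levels, fraud_flag = 0 is drawn in dodgerblue and fraud_flag = 1 in red.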