Frequency feature for one account
A frequency feature counts how often a certain event has happened in the past. Creating such features helps detect anomalous behavior. In the video, you learned how to create a frequency feature based on a categorical feature.
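To make the idea concrete, here is a minimal base R sketch on a made-up vector of payment channels. The data and the names `channel` and `prior_count` are illustrative assumptions and are not part of trans_Bob. For each position, it counts how many earlier entries carry the same value.

```r
# Hypothetical toy data: payment channels in chronological order
channel <- c("app", "web", "app", "app", "web")

# For each position i, count how many earlier entries equal entry i.
# seq_len(i - 1) is empty for i = 1, so the first count is 0.
prior_count <- sapply(seq_along(channel), function(i) {
  sum(channel[seq_len(i - 1)] == channel[i])
})

print(data.frame(channel, prior_count))
```

A channel that has rarely or never been used before gets a low count, which is the kind of signal that can flag unusual behavior.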
You're now provided with transactional data from Bob. One of the columns is called channel_cd, which indicates the payment channel that Bob used to book each of his transactions. You'll be creating a frequency feature called freq_channel based on the column channel_cd using the function rollapply(). You can use ?rollapply in the console to see the function documentation.
The dataset trans_Bob and the zoo and dplyr packages are loaded in your workspace.
This exercise is part of the course Fraud Detection in R.
Exercise instructions
- Write a function frequency_fun() which takes steps and channel as inputs, counts the number of steps, and sums how often the latest channel has been used in the past.
- Create the feature freq_channel by using the function rollapply() on the transfer_id column. The feature should count how often a particular channel_cd has been used before.
- Print the features channel_cd, freq_channel and fraud_flag. Inspect the newly created feature.
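The rolling-window mechanics behind these steps can be sketched on toy data. This is only an illustration under stated assumptions: the vectors `ids` and `amounts` and the names `past_window`, `window_sizes` and `current_amount` are hypothetical, and the helper functions are deliberately not the exercise solution. The sketch shows how a list of negative offsets restricts each window to earlier positions, and how an extra argument placed after FUN is passed through to every call.

```r
library(zoo)

# Hypothetical toy data: five transactions in chronological order
ids <- 1:5
amounts <- c(10, 20, 30, 40, 50)

# Negative offsets -1, -2, ..., -5 point only at earlier positions.
# partial = TRUE keeps whichever of those offsets fall inside the data.
past_window <- list(-1:-length(ids))

# How many past values does each call see?
window_sizes <- rollapply(ids, width = past_window, partial = TRUE, FUN = length)

# Extra arguments after FUN are handed to every call unchanged, so the
# function receives the full `amounts` vector and can look up the
# current row as length(steps) + 1.
current_amount <- rollapply(
  ids, width = past_window, partial = TRUE,
  FUN = function(steps, values) values[length(steps) + 1],
  amounts
)

print(window_sizes)
print(current_amount)
```

The first position has no past values, so it yields no result and the output is one element shorter than the input. That is why the sample code prepends a 0 to freq_channel before printing.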
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Frequency feature based on channel_cd
frequency_fun <- function(steps, channel) {
  n <- ___(___)
  frequency <- ___(___[1:n] == ___[___])
  return(frequency)
}
# Create freq_channel feature
freq_channel <- ___(trans_Bob$___, width = list(-1:-length(trans_Bob$___)), partial = ___, FUN = ___, trans_Bob$___)
# The first transaction has no history, so prepend a frequency of 0
freq_channel <- c(0, freq_channel)
# Print the features channel_cd, freq_channel and fraud_flag next to each other
cbind.data.frame(trans_Bob$___, ___, trans_Bob$fraud_flag)