Recency features

1. Recency features

In this lesson you'll learn how to engineer another kind of feature, called a recency feature. Recency features capture the dimension of time in a way that highlights the anomalous behavior of fraud cases.

2. Authentication method vs time

As an example, we again focus on authentication method. This figure shows the time at which Alice made a transfer and the corresponding authentication method she used.

3. Large time interval

When the time period between two consecutive transfers with the same authentication method is large, we say that the authentication method has not recently been used. In that case recency is close to zero.

4. Small time interval

When the time period between two consecutive transfers with the same authentication method is small, we say that the authentication method has recently been used. In that case recency is close to one.

5. Zero recency

When an authentication method is used for the first time, we set its recency equal to zero.

6. Anomalous behavior

A recency of zero, or close to zero, could indicate anomalous behavior.

7. Definition

We define recency as the exponential of minus gamma times t. Here t is the time-interval between two consecutive transfers in which, for example, the same authentication method was used. Gamma is a user-specified parameter and is typically rather small, for example 0.02. Notice that recency is always a number between 0 and 1.
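Written as a formula, the definition above reads as follows. The two numeric values are only an illustration, assuming that t is measured in days and that gamma is the example value 0.02.

$$
\text{recency}(t) = e^{-\gamma t}, \qquad t \ge 0,\ \gamma > 0
$$

$$
\gamma = 0.02:\quad t = 10 \Rightarrow e^{-0.2} \approx 0.82, \qquad t = 100 \Rightarrow e^{-2} \approx 0.14
$$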

8. Recency vs time

This figure shows that recency indeed decreases as the time interval gets bigger. The parameter gamma determines how fast recency decreases: for larger values of gamma, recency decreases more quickly with time, and for smaller values it decreases more slowly.
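The effect of gamma can be checked numerically with a few lines of base R. This is a minimal sketch: the helper name `recency`, the time points and the two gamma values are chosen here for illustration and do not come from the lesson.

```r
# Recency as defined in the lesson: exp(-gamma * t)
recency <- function(t, gamma) exp(-gamma * t)

# Hypothetical time intervals (in days) and two illustrative gamma values
t <- c(0, 30, 90, 180)
small_gamma <- recency(t, gamma = 0.01)
large_gamma <- recency(t, gamma = 0.05)

# Side-by-side comparison: the column for the larger gamma drops faster
print(data.frame(t, small_gamma, large_gamma))
```

At t = 0 both columns equal 1, and for every later time point the column with the larger gamma is the smaller of the two.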

9. How to choose parameter $\gamma$?

One practical question remains: how to choose gamma. Gamma is typically chosen such that recency equals 0.01 after 180 days. In that case gamma is -log(0.01)/180, where log is the natural logarithm.
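The value follows from solving exp(-gamma * 180) = 0.01 for gamma. A short check in base R, where the variable names `target_recency`, `horizon_days` and `gamma_value` are illustrative:

```r
# Require recency to equal 0.01 after 180 days and solve for gamma
target_recency <- 0.01
horizon_days <- 180
gamma_value <- -log(target_recency) / horizon_days

print(gamma_value)                        # roughly 0.0256

# Plugging gamma back in recovers the target recency
print(exp(-gamma_value * horizon_days))
```

The result is close to the example value of 0.02 mentioned in the definition.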

10. Recency feature in R (step 1)

We create a recency feature in R by first writing a function `recency_fun` that computes recency. The function takes `t`, `gamma`, `authentication_cd` and the frequency feature related to `authentication_cd` as inputs. If an authentication code has never been used, its frequency is 0, in which case the function should return 0. If an authentication code has already been used, compute the time difference between the two consecutive transfers with the same authentication code. Next, compute recency as the exponential of minus gamma multiplied by the time difference, and return this recency value.
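One way such a function could look is sketched below; it is not necessarily the exact code shown in the lesson. It assumes that `t` holds the timestamps of the current transfer and all earlier transfers of the same account, that `auth_cd` and `freq_auth` hold the full per-account columns in time order, and that timestamps are numeric (days). The argument names `auth_cd` and `freq_auth` and the local names `n_t`, `t_chrono` and `same_method` are illustrative.

```r
recency_fun <- function(t, gamma, auth_cd, freq_auth) {
  n_t <- length(t)                  # position of the current transfer
  if (freq_auth[n_t] == 0) {
    return(0)                       # code never used before: recency is 0
  }
  t_chrono <- sort(t)               # window in time order, current transfer last
  # Earlier transfers that used the same authentication code
  same_method <- auth_cd[seq_len(n_t)] == auth_cd[n_t]
  same_method[n_t] <- FALSE         # leave out the current transfer itself
  # Time since the most recent earlier transfer with the same code
  time_diff <- t_chrono[n_t] - max(t_chrono[same_method])
  exp(-gamma * time_diff)
}

# Hypothetical account: four transfers at days 0, 10, 15 and 40
auth_cd <- c("A", "B", "A", "B")
freq_auth <- c(0, 0, 1, 1)          # earlier uses of each transfer's code

r_first <- recency_fun(c(0), 0.02, auth_cd, freq_auth)
r_third <- recency_fun(c(15, 10, 0), 0.02, auth_cd, freq_auth)
r_fourth <- recency_fun(c(40, 15, 10, 0), 0.02, auth_cd, freq_auth)
print(c(r_first, r_third, r_fourth))
```

The first transfer gets recency 0 because code A has not been used before. The third transfer reuses code A after 15 days and the fourth reuses code B after 30 days, so their recencies are exp(-0.02 * 15) and exp(-0.02 * 30).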

11. Recency feature in R (step 2)

Next, choose a value for gamma. Then group the data by `account_name` and add the feature to the dataset with the function `mutate`, so that the recency feature is computed for all transfers at once. Use the function `rollapply` from the zoo package on the timestamp column so that `recency_fun` is applied to each transfer in turn. Specify the parameter `width` as a list containing the offsets from 0 down to minus the length of the `transfer_id` column, and set the parameter `partial` to TRUE.
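The call described above could be put together as in the following sketch, which assumes the dplyr and zoo packages are installed. The toy data, the frequency column `freq_auth`, the output column `rec_auth` and the variable `gamma_value` are made up for illustration, and `recency_fun` is repeated in a compact form (one possible implementation) so that the block runs on its own.

```r
library(dplyr)
library(zoo)

# Compact recency function; assumes transfers are sorted by time within an account
recency_fun <- function(t, gamma, auth_cd, freq_auth) {
  n_t <- length(t)
  if (freq_auth[n_t] == 0) return(0)
  t_chrono <- sort(t)
  same_method <- auth_cd[seq_len(n_t)] == auth_cd[n_t]
  same_method[n_t] <- FALSE
  exp(-gamma * (t_chrono[n_t] - max(t_chrono[same_method])))
}

# Hypothetical transfers; timestamp is in days
transfers <- data.frame(
  transfer_id = 1:8,
  account_name = c(rep("Alice", 5), rep("Bob", 3)),
  timestamp = c(0, 10, 15, 40, 41, 5, 6, 100),
  authentication_cd = c("A", "B", "A", "B", "C", "A", "A", "A"),
  stringsAsFactors = FALSE
)

# Frequency feature: earlier uses of the same code on the same account
transfers <- transfers %>%
  arrange(account_name, timestamp) %>%
  group_by(account_name, authentication_cd) %>%
  mutate(freq_auth = row_number() - 1) %>%
  ungroup()

gamma_value <- -log(0.01) / 180

# Offsets 0, -1, -2, ... give each transfer a window of itself and all earlier ones
transfers <- transfers %>%
  group_by(account_name) %>%
  mutate(rec_auth = rollapply(
    timestamp,
    width = list(0:-length(transfer_id)),
    partial = TRUE,
    FUN = recency_fun,
    gamma = gamma_value,
    auth_cd = authentication_cd,
    freq_auth = freq_auth
  )) %>%
  ungroup()

print(transfers)
```

Because `width` is a list of offsets and `partial` is TRUE, offsets that point before the first transfer of an account are simply dropped, so early transfers get a shorter window instead of a missing value.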

12. Result!

This is the result! The recency feature is in the second-to-last column. Notice that the last two fraudulent transfers have a recency of zero, meaning that the authentication method had never been used before. So recency features can help detect anomalies.

13. Features based on time, frequency and recency

Traditional features can now be combined with a new set of features based on customer spending history, such as frequency, recency and timestamps. Adding these features may bring a significant increase in model performance.

14. Let's practice!

Now let's try some examples.
