
Time features

1. Time features

Time is an important aspect in fraud detection. You will learn about tools to analyze timestamps and create features that indicate suspicious behavior.

2. Analyzing time

Events such as a customer's transactions are expected to happen at similar hours of the day. The goal is to create features that capture this time aspect of events. Because a clock is a circle, timestamps behave differently from ordinary numeric data. For example, when computing the average timestamp, it is easy to make the mistake of using the arithmetic mean, which ignores the periodic nature of time: 23:00 and 01:00 are only two hours apart, yet their arithmetic mean is 12:00 rather than midnight.

3. Mean of timestamps

In this example, the arithmetic mean is 11:30, which makes no sense. Digital times can be converted to decimal hours by first parsing them with the function hms from the lubridate package, then transforming the result to numeric values (seconds) and dividing by 3600.
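The contrast between the two kinds of mean can be sketched in a few lines. The course itself works in R; what follows is a minimal Python analogue using only the standard library. The helper names `hms_to_hours` and `circular_mean_hours` and the four timestamps are hypothetical, chosen for illustration, and are not the data from the slide.

```python
import math

def hms_to_hours(ts: str) -> float:
    """Convert an 'HH:MM:SS' string to decimal hours."""
    h, m, s = (int(part) for part in ts.split(":"))
    return (h * 3600 + m * 60 + s) / 3600

def circular_mean_hours(hours):
    """Mean direction on a 24-hour clock, returned in [0, 24)."""
    angles = [2 * math.pi * h / 24 for h in hours]
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return (math.atan2(s, c) * 24 / (2 * math.pi)) % 24

# Hypothetical timestamps clustered around midnight
stamps = ["23:00:00", "23:30:00", "00:30:00", "01:00:00"]
hours = [hms_to_hours(t) for t in stamps]

arithmetic = sum(hours) / len(hours)   # lands at noon, far from every observation
periodic = circular_mean_hours(hours)  # lands at midnight (0, or just below 24 after rounding)
print(arithmetic, periodic)
```

The periodic mean averages the points on the unit circle rather than the raw numbers, which is why it respects the wrap-around at midnight.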

4. Circular histogram

A histogram of timestamps drawn around a clock face is called a circular histogram and can be made with the function coord_polar from the ggplot2 package. The arithmetic mean is added as a vertical line.

5. Circular histogram with arithmetic mean

You can see that the arithmetic mean is not even close to the average time.

6. von Mises probability distribution

We can model a timestamp as a periodic variable by using the von Mises distribution. The von Mises distribution is the circular analogue of the normal distribution: it closely approximates a normally distributed variable wrapped around a circle. It is defined by two parameters, mu and kappa. Mu is the periodic mean and kappa is a measure of concentration, such that 1/kappa plays the role of the periodic variance.
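For reference, the standard form of the von Mises density is given here. The symbols $x$ and $t$ are notation introduced for this note: $x$ is an angle in radians, and a clock time $t$ in hours maps to $x = 2\pi t / 24$.

$$
f(x \mid \mu, \kappa) = \frac{\exp\left(\kappa \cos(x - \mu)\right)}{2\pi I_0(\kappa)}
$$

Here $I_0$ is the modified Bessel function of the first kind of order zero. For large $\kappa$ the density is close to a normal density with mean $\mu$ and variance $1/\kappa$, and for $\kappa = 0$ it is uniform on the circle.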

7. Estimate parameters $\mu$ and $\kappa$

Before estimating mu and kappa, we first convert the timestamps to class circular and specify the units as hours and the template as a 24-hour clock. Mu and kappa can then be estimated with the function mle.vonmises from the circular package. Notice that we use modulo 24 to ensure that the parameter mu is positive.
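As a rough illustration of what such an estimation does, here is a Python sketch of the maximum likelihood estimates, using only the standard library. It is not the code behind mle.vonmises: the function names `bessel_i` and `fit_von_mises`, the bisection bounds, and the sample hours are all assumptions made for this example. The estimate of mu is the mean direction, and kappa solves $I_1(\kappa)/I_0(\kappa) = \bar{R}$, where $\bar{R}$ is the mean resultant length.

```python
import math

def bessel_i(order: int, x: float) -> float:
    """Modified Bessel function of the first kind (order 0 or 1) via its power series."""
    term = (x / 2) ** order / math.factorial(order)
    total = term
    k = 0
    while term > 1e-17 * total:
        k += 1
        term *= (x / 2) ** 2 / (k * (k + order))
        total += term
    return total

def fit_von_mises(hours):
    """Return (mu, kappa): mu in hours within [0, 24), kappa on the angular scale."""
    angles = [2 * math.pi * h / 24 for h in hours]
    s = sum(math.sin(a) for a in angles) / len(angles)
    c = sum(math.cos(a) for a in angles) / len(angles)
    # Modulo 24 keeps the estimated mean on the positive side of the clock
    mu = (math.atan2(s, c) * 24 / (2 * math.pi)) % 24
    r_bar = math.hypot(s, c)  # mean resultant length, between 0 and 1
    # Solve I1(kappa) / I0(kappa) = r_bar by bisection (upper bound 500 is an assumption)
    lo, hi = 0.0, 500.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if bessel_i(1, mid) / bessel_i(0, mid) < r_bar:
            lo = mid
        else:
            hi = mid
    return mu, (lo + hi) / 2

# Hypothetical transaction times of one customer, in decimal hours
mu, kappa = fit_von_mises([8.5, 9.0, 9.25, 10.0, 9.5, 8.75])
print(mu, kappa)
```

Tightly clustered timestamps give a large kappa, while timestamps spread evenly around the clock push kappa toward zero.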

8. Circular histogram with periodic mean

This circular histogram shows the periodic mean as estimated by the parameter mu.

9. Confidence interval

Using the estimated von Mises distribution, a new set of features can be obtained, such as a binary feature indicating if a timestamp lies inside the confidence interval. Suppose we have a set of timestamps of transactions made by the same customer. We can estimate the parameters mu and kappa of the von Mises distribution on that set. Next, we compute the density or likelihood of each timestamp by using the function dvonmises.

10. Feature extraction

A timestamp lies in the confidence interval if its likelihood is larger than a certain threshold. Suppose we want a confidence interval with 90% probability. Then we set alpha equal to 0.9 and compute the cutoff value with the functions dvonmises and qvonmises: the cutoff is the density at the quantile (1 - alpha) / 2. The binary TRUE or FALSE feature tells us which timestamps have a density that is larger than the cutoff and therefore lie inside the confidence interval.
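The cutoff logic can be sketched without a quantile function. The Python example below (standard library only) assumes mu and kappa are already known and finds the density at the edge of the central region around mu that holds probability alpha, which by symmetry matches the density at the (1 - alpha) / 2 tail quantile. The function names, the numerical integration, and the values mu = 9 and kappa = 10 are assumptions for illustration, and the code is meant for moderate kappa only.

```python
import math

def bessel_i0(x: float) -> float:
    """Modified Bessel function I0 via its power series."""
    term, total, k = 1.0, 1.0, 0
    while term > 1e-17 * total:
        k += 1
        term *= (x / 2) ** 2 / (k * k)
        total += term
    return total

def density(hour: float, mu: float, kappa: float) -> float:
    """von Mises density (per radian) of a clock time, with mu given in hours."""
    angle = 2 * math.pi * (hour - mu) / 24
    return math.exp(kappa * math.cos(angle)) / (2 * math.pi * bessel_i0(kappa))

def density_cutoff(kappa: float, alpha: float = 0.9, steps: int = 20000) -> float:
    """Density at the edge of the region around the mean holding probability alpha."""
    norm = 2 * math.pi * bessel_i0(kappa)
    width = math.pi / steps
    mass = 0.0
    prev = math.exp(kappa) / norm  # density at the mean itself
    for i in range(1, steps + 1):
        cur = math.exp(kappa * math.cos(i * width)) / norm
        mass += (prev + cur) * width  # trapezoid rule, doubled to cover both sides
        if mass >= alpha:
            return cur
        prev = cur
    return prev

# Hypothetical fitted parameters and timestamps to score
mu, kappa = 9.0, 10.0
cutoff = density_cutoff(kappa, alpha=0.9)
hours = [9.5, 8.0, 15.0, 3.0]
inside = [density(h, mu, kappa) >= cutoff for h in hours]
print(inside)  # [True, True, False, False]
```

With these values the 90% region reaches roughly two hours either side of 09:00, so the first two timestamps are flagged as typical and the last two as unusual.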

11. Confidence interval

This circular plot shows the 90% confidence interval.

12. Confidence interval

If the timestamp of a transaction falls outside the confidence interval, the transaction can be considered an anomaly.

13. Example

We can calculate a binary feature that takes the value TRUE if the time of the current transaction is within the 90% confidence interval of the previous timestamps.

14. Confidence interval with moving time window

Parameters mu and kappa are estimated after the second transaction. The historic dataset contains all but the current timestamp. After estimating mu and kappa, the likelihood of the current timestamp is computed. Finally, the cutoff value with the desired confidence level is calculated and the binary feature records whether the density of the current timestamp is larger than this cutoff or not.
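The procedure described above can be sketched as a loop over transactions. This Python version (standard library only) restates small helper functions so that it runs on its own. Everything in it is an assumption made for illustration rather than the course's R code: the names `fit`, `cutoff`, and `time_feature`, the bisection bound of 500 for kappa, the integration step count, and the sample hours.

```python
import math

def bessel_i(order: int, x: float) -> float:
    """Modified Bessel function of the first kind (order 0 or 1) via its power series."""
    term = (x / 2) ** order / math.factorial(order)
    total, k = term, 0
    while term > 1e-17 * total:
        k += 1
        term *= (x / 2) ** 2 / (k * (k + order))
        total += term
    return total

def fit(angles):
    """Maximum likelihood estimates (mu in radians, kappa) from angles in radians."""
    s = sum(math.sin(a) for a in angles) / len(angles)
    c = sum(math.cos(a) for a in angles) / len(angles)
    r_bar = math.hypot(s, c)
    lo, hi = 0.0, 500.0  # kappa is capped at 500, an assumption of this sketch
    for _ in range(60):
        mid = (lo + hi) / 2
        if bessel_i(1, mid) / bessel_i(0, mid) < r_bar:
            lo = mid
        else:
            hi = mid
    return math.atan2(s, c), (lo + hi) / 2

def cutoff(kappa: float, alpha: float, steps: int = 5000) -> float:
    """Density at the edge of the region around the mean holding probability alpha."""
    norm = 2 * math.pi * bessel_i(0, kappa)
    width = math.pi / steps
    mass, prev = 0.0, math.exp(kappa) / norm
    for i in range(1, steps + 1):
        cur = math.exp(kappa * math.cos(i * width)) / norm
        mass += (prev + cur) * width  # trapezoid rule on both sides of the mean
        if mass >= alpha:
            return cur
        prev = cur
    return prev

def time_feature(hours, alpha: float = 0.9):
    """Flag each transaction against the distribution fitted on all earlier ones."""
    angles = [2 * math.pi * h / 24 for h in hours]
    feature = [None, None]  # no estimate until two earlier timestamps exist
    for i in range(2, len(angles)):
        mu, kappa = fit(angles[:i])  # history excludes the current timestamp
        norm = 2 * math.pi * bessel_i(0, kappa)
        dens = math.exp(kappa * math.cos(angles[i] - mu)) / norm
        feature.append(dens >= cutoff(kappa, alpha))
    return feature

# Hypothetical transaction times of one customer, in decimal hours
flags = time_feature([9.0, 9.5, 8.75, 9.25, 10.0, 9.1, 21.0])
print(flags)
```

Early entries are volatile because the distribution is fitted on very few timestamps; the flag for the evening transaction at 21:00 is the kind of signal the feature is meant to surface.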

15. Let's practice!

Now it's your turn to plot circular histograms and compute time features!
