Calculating RFM metrics
1. Calculating RFM metrics
In this lesson we will learn how to calculate Recency, Frequency and Monetary Value for each customer.
2. Definitions
First, let's nail down the definitions of the RFM values. Recency is the number of days since the customer's last transaction. The lower it is, the better, since every company wants its customers to be recent and active. Frequency is the number of transactions in the last 12 months, although there are variations, such as average monthly transactions, that capture the essence of this metric as well. Third, monetary value is the total amount the customer has spent with the company in the last 12 months. One comment, though: 12 months is the standard window, but the period can be chosen to suit the business model and the lifecycle of the products and customers.
3. Dataset and preparations
As in the previous lessons, we will use the same online dataset. Now, we will do some data preparation before calculating the RFM values.
4. Data preparation steps
The online dataset has already been pre-processed and only includes the most recent 12 months of data. We can confirm that by viewing the min() and max() of InvoiceDate, which cover a full year. In the real world, we would be working with the most recent snapshot of the data, from today or yesterday. In this case the data comes from 2010 and 2011, so we have to create a hypothetical snapshot date to use as the reference point, as if we were doing the analysis on the most recent data. To do that, we take the last InvoiceDate in the dataset and add one day using timedelta() from the datetime library. The argument days=1 creates a period of one day, which we can then add to our date.
5. Calculate RFM metrics
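The snapshot-date step from the data preparation can be sketched as follows. This is a minimal illustration, not the course's exact code: the small `online` dataframe here is a made-up stand-in for the real dataset, and the `CustomerID` column name is an assumption (only `InvoiceDate` is named in the lesson).

```python
import datetime as dt

import pandas as pd

# Hypothetical stand-in for the course's `online` dataset (values are made up).
online = pd.DataFrame({
    "CustomerID": [1, 1, 2],  # assumed column name
    "InvoiceDate": pd.to_datetime(["2010-12-10", "2011-12-09", "2011-06-01"]),
})

# Confirm the period covered by the data.
print("Min:", online["InvoiceDate"].min())
print("Max:", online["InvoiceDate"].max())

# Hypothetical "today": one day after the last recorded invoice.
snapshot_date = online["InvoiceDate"].max() + dt.timedelta(days=1)
print("Snapshot date:", snapshot_date)
```

Adding one day means that a customer who bought on the very last day in the data gets a recency of 1 rather than 0.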
Now that we're done with preparations, we can finally calculate the RFM metrics. First, we aggregate the data at the customer level and calculate three metrics. For recency, we pass InvoiceDate to a lambda function that takes the difference between our snapshot date, which would be today in the real world, and the most recent, or max(), invoice date. This gives us the number of days between the hypothetical today and the last transaction. Then we count the invoices for our frequency metric, and sum all the spend recorded in the TotalSum variable. In the next step we rename the columns in the new dataframe for easier interpretation. And finally, let's view the result!
6. Final RFM values
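The aggregation just described can be sketched roughly like this. It assumes the dataset has `CustomerID` and `InvoiceNo` columns (names not stated in the lesson; `InvoiceDate` and `TotalSum` are), uses made-up sample rows, and the names `datamart`, `Recency`, `Frequency` and `MonetaryValue` are illustrative choices.

```python
import datetime as dt

import pandas as pd

# Made-up sample rows standing in for the `online` dataset.
online = pd.DataFrame({
    "CustomerID": [1, 1, 2, 2, 2],                # assumed column name
    "InvoiceNo": ["A1", "A2", "B1", "B2", "B3"],  # assumed column name
    "InvoiceDate": pd.to_datetime(
        ["2011-01-05", "2011-11-30", "2011-03-01", "2011-06-15", "2011-12-09"]
    ),
    "TotalSum": [10.0, 20.0, 5.0, 7.5, 2.5],
})

# Hypothetical "today": one day after the last invoice, i.e. 2011-12-10.
snapshot_date = online["InvoiceDate"].max() + dt.timedelta(days=1)

# One row per customer with the three raw metrics.
datamart = online.groupby("CustomerID").agg({
    # Days between the snapshot date and the customer's latest invoice.
    "InvoiceDate": lambda x: (snapshot_date - x.max()).days,
    # Number of invoice rows for the customer.
    "InvoiceNo": "count",
    # Total spend for the customer.
    "TotalSum": "sum",
})

# Rename the columns for easier interpretation.
datamart = datamart.rename(columns={
    "InvoiceDate": "Recency",
    "InvoiceNo": "Frequency",
    "TotalSum": "MonetaryValue",
})

print(datamart.head())
```

Note that "count" counts rows, so if one invoice spans several line items, each line adds to the frequency. Using "nunique" on the invoice column instead would count distinct invoices, if that is the intended definition.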
The result is a table with one row per customer, holding their recency, frequency and monetary value as of the snapshot date, as if we were running the analysis the day after the data was pulled from the retailer's website, which would be the case in the real world. This is all we need for our next steps in building powerful and intuitive RFM segments.
7. Let's practice calculating RFM values!
Now, you will practice calculating these metrics on your own before we dig further into RFM segmentation!