Pre-process RFM data
We have loaded the dataset with RFM values you calculated previously as datamart_rfm. Since the variables are skewed and are on different scales, you will now un-skew and normalize them.
The pandas library is loaded as pd, and numpy as np. Take some time to explore the datamart_rfm in the console.
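One way to do that exploration is to look at summary statistics and per-column skewness. The sketch below runs on a hypothetical stand-in DataFrame, since the real datamart_rfm is not shown here; the name toy_rfm, the column names, and the generated values are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for datamart_rfm: column names and values are assumptions.
rng = np.random.default_rng(0)
toy_rfm = pd.DataFrame(
    {
        "Recency": rng.integers(1, 366, size=500),
        "Frequency": rng.lognormal(mean=2.0, sigma=1.0, size=500).round() + 1,
        "MonetaryValue": rng.lognormal(mean=5.0, sigma=1.2, size=500),
    }
)

# Count, mean, std, min, quartiles and max for every column:
# differing means and stds show the columns are on different scales.
summary = toy_rfm.describe()
print(summary)

# Sample skewness per column: values well above 0 indicate a long right tail.
skewness = toy_rfm.skew()
print(skewness)
```

In the console, the same two calls can be applied directly to datamart_rfm.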
This exercise is part of the course Customer Segmentation in Python.
Exercise instructions
- Apply log transformation to unskew the datamart_rfm and store it as datamart_log.
- Initialize a StandardScaler() instance as scaler and fit it on the datamart_log data.
- Transform the data by scaling and centering it with scaler.
- Create a pandas DataFrame from 'datamart_normalized' by adding index and column names from datamart_rfm.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Unskew the data
datamart_log = np.____(____)
# Initialize a standard scaler and fit it
scaler = ____()
scaler.____(____)
# Scale and center the data
datamart_normalized = ____.____(____)
# Create a pandas DataFrame
datamart_normalized = pd.____(data=____, index=____.index, columns=____.columns)
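
The four steps can be sketched end to end on a small made-up dataset. This is one possible way to carry them out, not necessarily the course's reference solution; toy_rfm, its column names, its CustomerID index, and its values are hypothetical. It assumes scikit-learn is installed and that every value is strictly positive, since np.log is undefined for zero and negative inputs.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data standing in for datamart_rfm (all values > 0).
toy_rfm = pd.DataFrame(
    {
        "Recency": [2, 30, 180, 365],
        "Frequency": [50, 12, 3, 1],
        "MonetaryValue": [4200.0, 610.0, 75.0, 12.5],
    },
    index=pd.Index([101, 102, 103, 104], name="CustomerID"),
)

# Unskew: the natural log compresses the long right tail of each column.
toy_log = np.log(toy_rfm)

# Fit the scaler: it learns each column's mean and standard deviation.
scaler = StandardScaler()
scaler.fit(toy_log)

# Scale and center: the result is a plain NumPy array without labels.
toy_normalized = scaler.transform(toy_log)

# Restore the row index and column names from the original DataFrame.
toy_normalized = pd.DataFrame(
    data=toy_normalized, index=toy_rfm.index, columns=toy_rfm.columns
)

# Each column should now have mean ~0 and (population) std ~1.
print(toy_normalized.mean().round(2))
print(toy_normalized.std(ddof=0).round(2))
```

StandardScaler divides by the population standard deviation, which is why the check uses std(ddof=0) rather than pandas' default sample standard deviation.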