Normalize the variables
Now for the last step in data preparation. You will transform the unskewed dataset `wholesale_boxcox` so that all columns are on the same scale: each column will have a mean of zero and a standard deviation of 1. You will use the `StandardScaler` class from the `sklearn.preprocessing` module.
The unskewed `wholesale_boxcox` dataset you transformed in the previous exercise has been imported as a `pandas` DataFrame. A `StandardScaler()` instance has also been initialized as `scaler`.
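To make "same scale" concrete, here is a minimal sketch of what standardization computes. The toy array `X` is made up for illustration and is not the wholesale data. Each column has its mean subtracted and is then divided by its population standard deviation, which is what `StandardScaler` does with its default settings.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: 5 rows, 2 columns on very different scales
X = np.array([[1.0, 100.0],
              [2.0, 300.0],
              [3.0, 500.0],
              [4.0, 700.0],
              [5.0, 900.0]])

# Manual standardization: subtract the column mean,
# then divide by the population standard deviation (ddof=0)
manual = (X - X.mean(axis=0)) / X.std(axis=0)

# The same computation done by StandardScaler with default settings
scaled = StandardScaler().fit_transform(X)

# Both approaches should agree up to floating-point error
print(np.allclose(manual, scaled))
```

After scaling, both columns are centered at zero with unit spread, even though they started on very different scales.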
This exercise is part of the course
Machine Learning for Marketing in Python
Exercise instructions
- Fit the initialized `scaler` instance on the Box-Cox transformed dataset.
- Transform and store the scaled dataset as `wholesale_scaled`.
- Create a `pandas` DataFrame from the scaled dataset.
- Print the mean and standard deviation for all columns.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Fit the initialized `scaler` instance on the Box-Cox transformed dataset
scaler.___(wholesale_boxcox)
# Transform and store the scaled dataset as `wholesale_scaled`
wholesale_scaled = scaler.___(wholesale_boxcox)
# Create a `pandas` DataFrame from the scaled dataset
wholesale_scaled_df = pd.DataFrame(data=___,
                                   index=wholesale_boxcox.___,
                                   columns=wholesale_boxcox.columns)
# Print the mean and standard deviation for all columns
print(wholesale_scaled_df.agg(['___','std']).round())
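For reference, one way the completed steps might look when run outside the exercise environment. This is a sketch under assumptions: the real `wholesale_boxcox` is not available here, so a synthetic stand-in with made-up column names (`col_a`, `col_b`, `col_c`) is generated instead, and `scaler` is created explicitly rather than pre-initialized.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for wholesale_boxcox; values and column names are invented
rng = np.random.default_rng(0)
wholesale_boxcox = pd.DataFrame(
    rng.normal(loc=[5.0, 8.0, 3.0], scale=[1.0, 2.0, 0.5], size=(200, 3)),
    columns=["col_a", "col_b", "col_c"],
)

# In the exercise this instance is already initialized for you
scaler = StandardScaler()

# Fit: learn each column's mean and standard deviation
scaler.fit(wholesale_boxcox)

# Transform: apply the learned scaling; the result is a NumPy array
wholesale_scaled = scaler.transform(wholesale_boxcox)

# Wrap the array back into a DataFrame, reusing the original index and columns
wholesale_scaled_df = pd.DataFrame(data=wholesale_scaled,
                                   index=wholesale_boxcox.index,
                                   columns=wholesale_boxcox.columns)

# Print the mean and standard deviation for all columns
print(wholesale_scaled_df.agg(['mean', 'std']).round())
```

Note that `pandas` computes `std` with `ddof=1` by default while `StandardScaler` scales by the population standard deviation, so the unrounded `std` values are very slightly above 1; the `.round()` call hides that difference.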