Normalization
As discussed in the video, normalization linearly scales an entire column to the range 0 to 1, with 0 corresponding to the lowest value in the column and 1 to the highest.
When using scikit-learn (the most commonly used machine learning library in Python), you can use MinMaxScaler to apply normalization. (It is so named because it scales your values between a minimum and a maximum value.)
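To make the scaling concrete, here is a minimal sketch that compares MinMaxScaler with the formula (x - min) / (max - min) computed by hand. The toy_df DataFrame and its values are made up for illustration; they are not the course's dataset.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data (an assumption, not the exercise's so_numeric_df)
toy_df = pd.DataFrame({"Age": [18, 25, 40, 62]})

scaler = MinMaxScaler()
# The scaler expects 2-D input, hence the double brackets around the column name
scaled = scaler.fit_transform(toy_df[["Age"]]).ravel()

# The same scaling computed by hand: (x - min) / (max - min)
age = toy_df["Age"]
manual = ((age - age.min()) / (age.max() - age.min())).to_numpy()

print(scaled)
print(manual)
```

Both printed arrays should agree up to floating-point rounding, with the smallest age mapped to 0 and the largest to 1.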
This exercise is part of the course
Feature Engineering for Machine Learning in Python
Exercise instructions
- Import MinMaxScaler from sklearn's preprocessing module.
- Instantiate the MinMaxScaler() as MM_scaler.
- Fit the MinMaxScaler on the Age column of so_numeric_df.
- Transform the same column with the scaler you just fit.
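Fitting and transforming are separate steps because fitting only learns the column's minimum and maximum, while transforming applies them. The sketch below illustrates this on made-up data; train_df and new_df are hypothetical names, not part of the exercise.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical data (assumptions for illustration only)
train_df = pd.DataFrame({"Age": [20, 30, 40]})
new_df = pd.DataFrame({"Age": [25, 50]})

scaler = MinMaxScaler()
# fit() only records the minimum and maximum of the column it is given
scaler.fit(train_df[["Age"]])
print(scaler.data_min_, scaler.data_max_)

# transform() reuses the recorded minimum and maximum on any data passed in
new_scaled = scaler.transform(new_df[["Age"]]).ravel()
print(new_scaled)
```

Because 50 lies above the fitted maximum of 40, it maps to a value above 1. With default settings the scaler does not clip values that fall outside the range it was fitted on.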
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import MinMaxScaler
____
# Instantiate MinMaxScaler
MM_scaler = ____()
# Fit MM_scaler to the data
____.____(so_numeric_df[['Age']])
# Transform the data using the fitted scaler
so_numeric_df['Age_MM'] = ____.____(so_numeric_df[['Age']])
# Compare the original and transformed columns
print(so_numeric_df[['Age_MM', 'Age']].head())