Get startedGet started for free

Train and testing transformations (I)

So far you have created scalers based on a column, and then applied the scaler to the same data that it was trained on. When creating machine learning models you will generally build your models on historic data (train set) and apply your model to new unseen data (test set). In these cases you will need to ensure that the same scaling is being applied to both the training and test data.
To do this in practice you train the scaler on the train set, and keep the trained scaler to apply it to the test set. You should never retrain a scaler on the test set.

For this exercise and the next, we split the so_numeric_df DataFrame into train (so_train_numeric) and test (so_test_numeric) sets.

This exercise is part of the course

Feature Engineering for Machine Learning in Python

View Course

Exercise instructions

  • Instantiate the StandardScaler() as SS_scaler.
  • Fit the StandardScaler on the Age column.
  • Transform the Age column in the test set (so_test_numeric).

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Import StandardScaler
from sklearn.preprocessing import StandardScaler

# Apply a standard scaler to the data
SS_scaler = ____

# Fit the standard scaler to the data
____

# Transform the test data using the fitted scaler
so_test_numeric['Age_ss'] = ____
print(so_test_numeric[['Age', 'Age_ss']].head())
Edit and Run Code