CommencerCommencer gratuitement

Log normalization

Standardization is important to make sure all features are comparable. Log normalization is a common method of standardization. You will check the variance of select features and compute the overall median variance among features. The features will be the numeric ones, except for the click column, banner_pos, device_type, and columns search_engine_type, product_type, advertiser_type from last lesson since they are actually categorical columns. Then you will apply log normalization to these columns with a variance higher than the median variance and check results.

The pandas module is available as pd in your workspace and the sample DataFrame is loaded as df.

Cet exercice fait partie du cours

Predicting CTR with Machine Learning in Python

Afficher le cours

Exercice interactif pratique

Essayez cet exercice en complétant cet exemple de code.

# Select numeric columns and print variance
num_df = df.____(include=['int', 'float'])
filter_cols = ['click', 'banner_pos', 'device_type',
               'search_engine_type', 'product_type', 'advertiser_type']
new_df = num_df[num_df.columns[~num_df.columns.____(filter_cols)]]
median = new_df.____.____
print(median)
Modifier et exécuter le code