Encode categorical and scale numerical variables

In this final step, you will one-hot encode the categorical variables and then scale the numerical columns. The pandas library has been loaded for you as pd, along with the StandardScaler class from the sklearn.preprocessing module.
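As a rough illustration of what one-hot encoding with drop_first=True does, here is a minimal sketch on a hypothetical toy DataFrame. The column names and values below are invented for illustration and are not taken from telco_raw.

```python
import pandas as pd

# Hypothetical toy data; names and values are illustrative only.
toy = pd.DataFrame({
    "contract": ["monthly", "yearly", "monthly", "two_year"],
    "tenure": [1, 24, 5, 60],
})

# One-hot encode the categorical column. With drop_first=True, the first
# category level (here "monthly", the first in sorted order) gets no column
# of its own, since it is implied when all other indicator columns are 0.
encoded = pd.get_dummies(data=toy, columns=["contract"], drop_first=True)

encoded_columns = list(encoded.columns)
print(encoded_columns)
```

Dropping one level per variable avoids perfectly redundant indicator columns, which matters mostly for linear models.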

The raw telecom churn dataset telco_raw has been loaded for you as a pandas DataFrame, along with the lists custid, target, categorical, and numerical, which hold the column names you created in the previous exercise. You can familiarize yourself with the dataset by exploring it in the console.

This exercise is part of the course

Machine Learning for Marketing in Python

Instructions

  • Perform one-hot encoding on the categorical variables.
  • Initialize a StandardScaler instance.
  • Fit and transform the scaler on the numerical columns.
  • Build a DataFrame from scaled_numerical.
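The scaling steps above can be sketched on hypothetical toy data as follows. The names toy and numerical_cols and all values are assumptions made for illustration; the exercise itself uses telco_raw and the numerical list.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data; names and values are illustrative only.
toy = pd.DataFrame({
    "tenure": [1, 24, 5, 60],
    "monthly_charges": [29.85, 56.95, 53.85, 42.30],
})
numerical_cols = ["tenure", "monthly_charges"]

# Initialize the scaler, then fit it and transform the columns in one call.
# fit_transform returns a NumPy array, so the column names are lost.
scaler = StandardScaler()
scaled = scaler.fit_transform(toy[numerical_cols])

# Rebuild a DataFrame so the scaled values carry their column names again.
# Reusing the original index keeps rows aligned with the source DataFrame.
scaled_df = pd.DataFrame(scaled, columns=numerical_cols, index=toy.index)

# Each scaled column should have mean ~0 and population std (ddof=0) ~1.
max_abs_mean = float(scaled_df.mean().abs().max())
max_std_gap = float((scaled_df.std(ddof=0) - 1).abs().max())
```

Passing index=toy.index is an optional extra here; it only matters if the scaled columns are later joined back to a DataFrame whose index is not the default range.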

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Perform one-hot encoding on the categorical variables
telco_raw = pd.get_dummies(data=___, columns=categorical, drop_first=True)

# Initialize StandardScaler instance
scaler = ___()

# Fit and transform the scaler on numerical columns
scaled_numerical = ___.fit_transform(telco_raw[___])

# Build a DataFrame from scaled_numerical
scaled_numerical = pd.DataFrame(___, columns=numerical)