Encode categorical and scale numerical variables
In this final step, you will perform one-hot encoding on the categorical variables and then scale the numerical columns. The pandas library has been loaded for you as pd, as well as the StandardScaler class from the sklearn.preprocessing module.
The raw telecom churn dataset telco_raw has been loaded for you as a pandas DataFrame, as well as the lists custid, target, categorical, and numerical, which hold the column names you created in the previous exercise. You can familiarize yourself with the dataset by exploring it in the console.
This exercise is part of the course “Machine Learning for Marketing in Python”.
Exercise instructions
- Perform one-hot encoding on the categorical variables.
- Initialize a StandardScaler instance.
- Fit and transform the scaler on the numerical columns.
- Build a DataFrame from scaled_numerical.
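To make the first instruction more concrete, here is a minimal sketch of what pd.get_dummies does when drop_first=True is set. The toy DataFrame and its column names are hypothetical and are not taken from telco_raw.

```python
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
toy = pd.DataFrame({
    "Contract": ["Monthly", "Yearly", "Monthly", "Two-year"],
    "tenure": [1, 24, 5, 60],
})

# One-hot encode the "Contract" column. With drop_first=True the first
# category level (here "Monthly", the first in sorted order) is dropped,
# so three categories become two indicator columns.
encoded = pd.get_dummies(data=toy, columns=["Contract"], drop_first=True)

print(encoded.columns.tolist())
```

Dropping one level avoids a redundant column: a row with zeros in both remaining indicator columns can only belong to the dropped category.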
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Perform one-hot encoding on the categorical variables
telco_raw = pd.get_dummies(data=___, columns=categorical, drop_first=True)
# Initialize StandardScaler instance
scaler = ___()
# Fit and transform the scaler on numerical columns
scaled_numerical = ___.fit_transform(telco_raw[___])
# Build a DataFrame from scaled_numerical
scaled_numerical = pd.DataFrame(___, columns=numerical)
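As a rough illustration of how these four steps fit together, the sketch below runs them on a small made-up DataFrame. The name telco_toy, its columns, and its values are assumptions for demonstration only. This is one plausible way to complete the steps, not the course's official solution.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for telco_raw; all columns and values are made up.
telco_toy = pd.DataFrame({
    "customerID": ["a1", "b2", "c3", "d4"],
    "Partner": ["Yes", "No", "No", "Yes"],
    "tenure": [1, 24, 5, 60],
    "MonthlyCharges": [29.85, 56.95, 53.85, 42.30],
    "Churn": ["No", "No", "Yes", "No"],
})
# Assumed column lists, mirroring the role of `categorical` and `numerical`.
categorical = ["Partner"]
numerical = ["tenure", "MonthlyCharges"]

# Perform one-hot encoding on the categorical variables
telco_toy = pd.get_dummies(data=telco_toy, columns=categorical, drop_first=True)

# Initialize StandardScaler instance
scaler = StandardScaler()

# Fit and transform the scaler on numerical columns (returns a NumPy array)
scaled_numerical = scaler.fit_transform(telco_toy[numerical])

# Build a DataFrame from scaled_numerical, restoring the column names
scaled_numerical = pd.DataFrame(scaled_numerical, columns=numerical)

# StandardScaler uses the population standard deviation, hence ddof=0 here.
print(scaled_numerical.mean().round(6))
print(scaled_numerical.std(ddof=0).round(6))
```

After scaling, each numerical column should have a mean of approximately 0 and a population standard deviation of approximately 1, which the two print statements let you check.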
The course covers machine learning use cases for marketing in Python, from customer lifetime value and churn prediction to segmentation.
In this chapter, you will explore the basics of machine learning methods used in marketing. You will learn about different types of machine learning and data preparation steps, and you will run several end-to-end models to understand their power.
- Exercise 1: Why use ML for marketing? Strategies and use cases
- Exercise 2: Identify supervised learning examples
- Exercise 3: Supervised vs. unsupervised learning
- Exercise 4: Preparation for modeling
- Exercise 5: Investigate the data
- Exercise 6: Separate numerical and categorical columns
- Exercise 7: Encode categorical and scale numerical variables
- Exercise 8: ML modeling steps
- Exercise 9: Split data to training and testing
- Exercise 10: Fit a decision tree
- Exercise 11: Predict churn with decision tree