Separate numerical and categorical columns
In the last exercise, you explored the dataset characteristics and are now ready to do some data pre-processing. You will separate categorical and numerical variables from the telco_raw DataFrame using a customized unique value count threshold: columns with fewer unique values than the threshold are treated as categorical, the rest as numerical. The pandas module has been loaded for you as pd.
The raw telecom churn dataset telco_raw has been loaded for you as a pandas DataFrame. You can familiarize yourself with the dataset by exploring it in the console.
This is a part of the course “Machine Learning for Marketing in Python”.
Exercise instructions
- Store the customerID and Churn column names in custid and target.
- Assign to categorical the column names that have fewer than 5 unique values.
- Remove target from the list.
- Assign to numerical all column names that are not in custid, target, or categorical.
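To see how a unique value count threshold picks out columns, here is a minimal sketch on a small made-up DataFrame. The toy frame and its values are illustrative assumptions, not the real telco_raw data. DataFrame.nunique() returns a Series of unique counts indexed by column name, and a boolean mask on that Series selects the low-cardinality columns.

```python
import pandas as pd

# Hypothetical toy data standing in for telco_raw (values are made up)
toy = pd.DataFrame({
    "customerID": ["a1", "b2", "c3", "d4", "e5", "f6"],
    "gender": ["F", "M", "F", "M", "F", "M"],
    "SeniorCitizen": [0, 1, 0, 0, 1, 0],
    "tenure": [1, 34, 2, 45, 8, 22],
    "Churn": ["No", "No", "Yes", "No", "Yes", "No"],
})

# Count unique values per column; the result is a Series indexed by column name
unique_counts = toy.nunique()

# Keep the names of columns with fewer than 5 unique values
low_cardinality = unique_counts[unique_counts < 5].index.tolist()
print(low_cardinality)  # ['gender', 'SeniorCitizen', 'Churn']
```

Note that SeniorCitizen is stored as integers but still falls under the threshold, because the rule looks only at the number of distinct values and not at the column's dtype.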
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Store customerID and Churn column names
custid = ['___']
target = ['___']
# Store categorical column names
categorical = telco_raw.___()[telco_raw.nunique() < ___].keys().tolist()
# Remove target from the list of categorical variables
categorical.remove(___[0])
# Store numerical column names
numerical = [x for x in telco_raw.___ if x not in custid + ___ + categorical]
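One possible way to complete the template is sketched below. It runs against a small hypothetical stand-in for telco_raw (a handful of made-up rows and columns), so the printed lists will differ from what the real dataset produces. The blanks are filled on the assumption that they correspond to nunique, the threshold 5, target, and columns, as the instructions suggest.

```python
import pandas as pd

# Hypothetical stand-in for telco_raw (made-up values, only a few columns)
telco_raw = pd.DataFrame({
    "customerID": ["a1", "b2", "c3", "d4", "e5", "f6"],
    "Contract": ["Month-to-month", "One year", "Two year",
                 "One year", "Month-to-month", "Two year"],
    "MonthlyCharges": [29.85, 56.95, 53.85, 42.30, 70.70, 99.65],
    "tenure": [1, 34, 2, 45, 8, 22],
    "Churn": ["No", "No", "Yes", "No", "Yes", "No"],
})

# Store customerID and Churn column names
custid = ["customerID"]
target = ["Churn"]

# Store categorical column names (fewer than 5 unique values)
categorical = telco_raw.nunique()[telco_raw.nunique() < 5].keys().tolist()

# Remove target from the list of categorical variables
categorical.remove(target[0])

# Store numerical column names (everything not already assigned)
numerical = [x for x in telco_raw.columns if x not in custid + target + categorical]

print(categorical)  # ['Contract']
print(numerical)    # ['MonthlyCharges', 'tenure']
```

After these steps every column belongs to exactly one of custid, target, categorical, or numerical, which is a quick sanity check worth running on the real data as well.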
Machine Learning for Marketing in Python
From customer lifetime value and churn prediction to segmentation, learn and implement machine learning use cases for marketing in Python.
In this chapter, you will explore the basics of machine learning methods used in marketing. You will learn about different types of machine learning and data preparation steps, and run several end-to-end models to understand their power.