
Separate numerical and categorical columns

In the last exercise, you explored the dataset's characteristics, so you are now ready to do some data pre-processing. You will separate the categorical and numerical variables in the telco_raw DataFrame, using a custom threshold on the number of unique values to tell the two apart. The pandas module has been loaded for you as pd.

The raw telecom churn dataset has been loaded for you as a pandas DataFrame named telco_raw. You can familiarize yourself with the dataset by exploring it in the console.
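
As a rough illustration of the unique value count idea, the sketch below counts distinct values per column and compares them against a threshold. The toy DataFrame and the names toy, unique_counts and low_cardinality are hypothetical stand-ins, not part of the real telco_raw dataset.

```python
import pandas as pd

# Hypothetical toy data standing in for telco_raw (not the real dataset)
toy = pd.DataFrame({
    "customerID": ["a1", "b2", "c3", "d4", "e5", "f6"],
    "gender": ["F", "M", "F", "M", "F", "M"],
    "tenure": [1, 12, 24, 36, 48, 60],
    "Churn": ["No", "Yes", "No", "No", "Yes", "No"],
})

# Number of distinct values per column, as a Series indexed by column name
unique_counts = toy.nunique()
print(unique_counts)

# Boolean mask: True where a column has fewer than 5 distinct values
low_cardinality = unique_counts < 5
print(low_cardinality)
```

In this toy data, gender and Churn fall below the threshold, while customerID and tenure do not.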

This exercise is part of the course

Machine Learning for Marketing in Python


Exercise instructions

  • Store the customerID and Churn column names in custid and target.
  • Assign to categorical the names of the columns that have fewer than 5 unique values.
  • Remove the target column name from the categorical list.
  • Assign to numerical all column names that are not in custid, target, or categorical.

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Store customerID and Churn column names
custid = ['___']
target = ['___']

# Store categorical column names
categorical = telco_raw.___()[telco_raw.nunique() < ___].keys().tolist()

# Remove target from the list of categorical variables
categorical.remove(___[0])

# Store numerical column names
numerical = [x for x in telco_raw.___ if x not in custid + ___ + categorical]
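
For reference, here is a minimal sketch of the same pattern run end to end on a small, made-up DataFrame. It is one plausible way to apply the steps, not necessarily the course's reference solution. The toy data is hypothetical and only mimics a few telco-style columns.

```python
import pandas as pd

# Hypothetical toy data (assumption: column names mimic a telco churn table)
toy = pd.DataFrame({
    "customerID": ["a1", "b2", "c3", "d4", "e5", "f6"],
    "gender": ["F", "M", "F", "M", "F", "M"],
    "SeniorCitizen": [0, 0, 1, 0, 1, 0],
    "tenure": [1, 12, 24, 36, 48, 60],
    "MonthlyCharges": [29.85, 56.95, 53.85, 42.30, 70.70, 99.65],
    "Churn": ["No", "Yes", "No", "No", "Yes", "No"],
})

# Identifier and target column names, kept as one-element lists
custid = ["customerID"]
target = ["Churn"]

# Columns with fewer than 5 distinct values are treated as categorical
categorical = toy.nunique()[toy.nunique() < 5].keys().tolist()

# The target also has few distinct values, so drop it from the list
categorical.remove(target[0])

# Everything that is not an ID, the target, or categorical is numerical
numerical = [x for x in toy.columns if x not in custid + target + categorical]

print(categorical)
print(numerical)
```

Note that SeniorCitizen is stored as integers but ends up in categorical, because the split depends on the number of unique values rather than on the column's dtype.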