1. Learn
  2. /
  3. Courses
  4. /
  5. Designing Machine Learning Workflows in Python

Exercise

Categorical encodings

Your colleague has converted the columns in the credit dataset to numeric values using LabelEncoder(). He left one out: credit_history, which records the credit history of the applicant. You want to create two versions of the dataset. One will use LabelEncoder() and another one-hot encoding, for comparison purposes. The feature matrix is available to you as credit. You have LabelEncoder() preloaded and pandas as pd.

Instructions

100 XP
  • Encode credit_history using LabelEncoder().
  • Concatenate the result to the original frame.
  • Create a new data frame by concatenating the 1-hot encoding dummies to the original frame.
  • Confirm that 1-hot encoding produces more columns than label encoding.