Feature engineering
You are tasked with predicting whether or not a new cohort of loan applicants are likely to default on their loans. You have a historical dataset and wish to train a classifier on it. You notice that many features are in string format, which is a problem for your classifiers. You hence decide to encode the string columns numerically using LabelEncoder()
. The function has been preloaded for you from the preprocessing
submodule of sklearn
. The dataset credit
is also preloaded, as is a list of all column names whose data types are string, stored in non_numeric_columns
.
This exercise is part of the course
Designing Machine Learning Workflows in Python
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Inspect the first few lines of your data using head()
credit.____