Get startedGet started for free

Feature engineering

You are tasked with predicting whether or not a new cohort of loan applicants are likely to default on their loans. You have a historical dataset and wish to train a classifier on it. You notice that many features are in string format, which is a problem for your classifiers. You hence decide to encode the string columns numerically using LabelEncoder(). The function has been preloaded for you from the preprocessing submodule of sklearn. The dataset credit is also preloaded, as is a list of all column names whose data types are string, stored in non_numeric_columns.

This exercise is part of the course

Designing Machine Learning Workflows in Python

View Course

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Inspect the first few lines of your data using head()
credit.____
Edit and Run Code