Encoding categorical variables - binary
Take a look at the hiking dataset. Several of its columns need encoding before they can be modeled, one of which is the Accessible column. Accessible is a binary feature with two values, Y or N, which need to be encoded as 1s and 0s. Use scikit-learn's LabelEncoder class to perform this transformation.
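As a rough illustration of how the mapping works, the sketch below runs LabelEncoder on a short list of Y/N values. The list is made up for this example and is not taken from the hiking dataset. LabelEncoder sorts the unique labels and uses each label's position as its code, so N becomes 0 and Y becomes 1.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample values standing in for a binary Y/N column
values = ["Y", "N", "N", "Y"]

enc = LabelEncoder()
encoded = enc.fit_transform(values)

# classes_ holds the sorted unique labels; a label's index is its integer code
print(enc.classes_.tolist())  # ['N', 'Y']
print(encoded.tolist())       # [1, 0, 0, 1]
```

The fitted encoder keeps the mapping in classes_, so the original labels can be recovered later with enc.inverse_transform(encoded).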
This exercise is part of the course Preprocessing for Machine Learning in Python.
Exercise instructions
- Store LabelEncoder() in a variable named enc.
- Using the encoder's .fit_transform() method, encode the hiking dataset's "Accessible" column. Call the new column Accessible_enc.
- Compare the two columns side-by-side to see the encoding.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Set up the LabelEncoder object
enc = ____
# Apply the encoding to the "Accessible" column
____ = ____.____(____)
# Compare the two columns
print(____[[____, ____]].head())
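
One possible way to fill in the template is sketched below. Because the real hiking dataset is not included here, the sketch builds a small stand-in DataFrame: the "Name" column and all row values are assumptions made for illustration, and only the "Accessible" column name comes from the exercise.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for the hiking dataset; the real data has other columns and rows
hiking = pd.DataFrame({
    "Name": ["Trail A", "Trail B", "Trail C"],
    "Accessible": ["Y", "N", "N"],
})

# Set up the LabelEncoder object
enc = LabelEncoder()

# Apply the encoding to the "Accessible" column
hiking["Accessible_enc"] = enc.fit_transform(hiking["Accessible"])

# Compare the two columns
print(hiking[["Accessible", "Accessible_enc"]].head())
```

With the actual course data, only the DataFrame construction would change; the three exercise steps stay the same.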