Label encoding
Let's work on categorical variables encoding. You will again work with a subsample from the House Prices Kaggle competition.
Your objective is to encode categorical features "RoofStyle" and "CentralAir" using label encoding. The train
and test
DataFrames are already available in your workspace.
Diese Übung ist Teil des Kurses
Winning a Kaggle Competition in Python
Anleitung zur Übung
- Concatenate
train
andtest
DataFrames into a single DataFramehouses
. - Create a
LabelEncoder
object without arguments and assign it tole
. - Create new label-encoded features for "RoofStyle" and "CentralAir" using the same
le
object.
Interaktive Übung
Versuche dich an dieser Übung, indem du diesen Beispielcode vervollständigst.
# Concatenate train and test together
houses = ____.____([train, test])
# Label encoder
from sklearn.preprocessing import LabelEncoder
le = ____()
# Create new features
houses['RoofStyle_enc'] = le.fit_transform(houses[____])
houses['CentralAir_enc'] = ____.____(____[____])
# Look at new features
print(houses[['RoofStyle', 'RoofStyle_enc', 'CentralAir', 'CentralAir_enc']].head())