Encoding categorical variables
There are a couple of columns in the UFO dataset that need to be encoded before they can be modeled with scikit-learn. You'll do that transformation here, using both binary and one-hot encoding methods.
This exercise is part of the course
Preprocessing for Machine Learning in Python
Exercise instructions
- Using `apply()`, write a conditional `lambda` function that returns a `1` if the value is `"us"`, else return `0`.
- Print out the number of `.unique()` values in the `type` column.
- Using `pd.get_dummies()`, create a one-hot encoded set of the `type` column.
- Finally, use `pd.concat()` to concatenate the `type_set` encoded variables to the `ufo` dataset.
Interactive exercise
Complete the sample code to successfully finish this exercise.
# Use pandas to encode us values as 1 and others as 0
ufo["country_enc"] = ufo["country"].____
# Print the number of unique type values
print(len(____.unique()))
# Create a one-hot encoded set of the type values
type_set = ____
# Concatenate this set back to the ufo DataFrame
ufo = pd.concat([____, ____], axis=1)
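
As a rough illustration of the four steps above, here is a minimal sketch run on a tiny stand-in DataFrame. The real `ufo` dataset is not shown in this exercise, so the rows below are invented purely for demonstration; only the column names `country` and `type` and the variable names `country_enc` and `type_set` come from the exercise itself. This is one way the blanks could be filled, not necessarily the course's reference solution.

```python
import pandas as pd

# Hypothetical stand-in for the course's ufo DataFrame (rows invented for illustration)
ufo = pd.DataFrame({
    "country": ["us", "ca", "us", "gb"],
    "type": ["light", "circle", "light", "triangle"],
})

# Binary encoding: 1 when the country is "us", 0 for anything else
ufo["country_enc"] = ufo["country"].apply(lambda val: 1 if val == "us" else 0)

# Count the distinct values in the type column
print(len(ufo["type"].unique()))

# One-hot encoding: one indicator column per distinct type value
type_set = pd.get_dummies(ufo["type"])

# Append the indicator columns to the original DataFrame, side by side
ufo = pd.concat([ufo, type_set], axis=1)
print(ufo.columns.tolist())
```

Binary encoding fits `country` here because the exercise only distinguishes "us" from everything else, while `type` has several unordered categories, so each one gets its own indicator column. Passing `axis=1` to `pd.concat()` joins the two frames column-wise on their shared index rather than stacking rows.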