Creating dummies from a many-categories variable
You are given a basetable with one predictive variable, "country". Create dummy variables for "country" so that it can be used as a predictive variable in a logistic regression model.
This exercise is part of the course
Intermediate Predictive Analytics in Python
Exercise instructions
- Create a pandas dataframe dummies_country that has the dummy variables for "country". Make sure you avoid multicollinearity.
- Add these dummies to the original basetable.
- Remove the original variable "country" from the basetable.
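The multicollinearity warning in the instructions can be illustrated with a small sketch. The country values below are hypothetical sample data, not the exercise's actual basetable. With one dummy per category, every row contains exactly one 1, so the dummy columns always sum to 1 and are perfectly collinear with a model intercept. Dropping one category, which then acts as the reference level, removes that redundancy.

```python
import pandas as pd

# Hypothetical sample data; the exercise's real basetable is not shown here.
country = pd.Series(["USA", "UK", "India", "UK", "USA"], name="country")

# One dummy per category: each row has exactly one 1,
# so the columns of every row sum to 1 (the source of the collinearity).
full = pd.get_dummies(country, dtype=int)
row_sums = full.sum(axis=1)

# drop_first=True omits the first category in sorted order,
# which then serves as the reference level.
reduced = pd.get_dummies(country, drop_first=True, dtype=int)

print(full.columns.tolist())
print(reduced.columns.tolist())
```

With k categories this leaves k - 1 dummy columns, and a row of all zeros stands for the dropped category.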
Interactive exercise
Try this exercise by completing the sample code.
# Create the dummy variable
dummies_country = ____.____(____["____"], ____=____)
# Add the dummy variable to the basetable
basetable = ____.____([____, ____], ____=____)
# Delete the original variable from the basetable
____ ____["____"]
print(basetable.head())
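One way the blanks might be filled in is sketched below. It assumes pandas is imported as pd and uses a small hypothetical basetable in place of the one the exercise provides. The calls pd.get_dummies, pd.concat and the del statement are a plausible fit for the shape of the template, not a confirmed reference solution.

```python
import pandas as pd

# Hypothetical stand-in for the exercise's basetable.
basetable = pd.DataFrame({"country": ["USA", "UK", "India", "UK", "USA"]})

# Create the dummy variables; drop_first=True leaves out one category
# to avoid multicollinearity.
dummies_country = pd.get_dummies(basetable["country"], drop_first=True)

# Add the dummy variables to the basetable as new columns (axis=1).
basetable = pd.concat([basetable, dummies_country], axis=1)

# Delete the original variable from the basetable.
del basetable["country"]

print(basetable.head())
```

Concatenating with axis=1 aligns rows on the index, so this works as long as dummies_country keeps the same index as the basetable, which is the case when it is built directly from the basetable's own column.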