Dummy variables
In the last exercise of the course, you will prepare your data for modeling by dummy encoding your non-numeric columns.
For example, if you have a column of gender values, 'Male' and 'Female', you want separate columns that tell you whether the observation is from a 'Male' or a 'Female'. This process of creating dummy variables is also called one-hot encoding.
You can use the get_dummies() function from pandas to convert the non-numeric columns into dummy variables:
df_new = pd.get_dummies(df)
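To make the effect of get_dummies() concrete, here is a minimal sketch on a small made-up DataFrame. The column names and values are hypothetical and are not taken from the flights data used in this exercise.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only
df = pd.DataFrame({
    "gender": ["Male", "Female", "Female"],
    "age": [34, 28, 45],
})

# Each non-numeric column is replaced by one indicator column per category;
# numeric columns such as 'age' are passed through unchanged
df_new = pd.get_dummies(df)

print(df_new.columns.tolist())
```

The original gender column is gone from the result, replaced by gender_Female and gender_Male. If you only want one indicator column per two-level variable, get_dummies() also accepts drop_first=True, which drops the first category of each encoded column.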
We've subsetted the flights DataFrame to create flights_sub, which makes it easier to see what is happening.
This exercise is part of the course Python for R Users.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Look at the head of flights_sub
print(____)