Creating dummy variables
Including categorical features in the model-building process can improve performance, since they may carry information that contributes to prediction accuracy.
The music_df dataset has been preloaded for you, and its shape is printed. Also, pandas has been imported as pd.
Now you will create a new DataFrame containing the original columns of music_df plus dummy variables from the "genre" column.
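As a small illustration of what dummy variables look like, the sketch below encodes a made-up genre column on its own. The values are invented for illustration and are not taken from music_df.

```python
import pandas as pd

# Hypothetical genre values, invented for illustration only
genres = pd.Series(["Rock", "Jazz", "Rock", "Pop"], name="genre")

# One binary column per distinct category
dummies = pd.get_dummies(genres)

print(list(dummies.columns))
print(dummies)
```

Each row has exactly one "on" column, so any one column can be inferred from the others. That redundancy is why it is common to drop one of the binary columns.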
This exercise is part of the course
Supervised Learning with scikit-learn
Exercise instructions
- Use a relevant function, passing the entire music_df DataFrame, to create music_dummies, dropping the first binary column.
- Print the shape of music_dummies.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create music_dummies
music_dummies = ____
# Print the new DataFrame's shape
print("Shape of music_dummies: {}".format(____))
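One possible way to fill in the blanks is sketched below, assuming pd.get_dummies with drop_first=True is the "relevant function" the instructions have in mind. The small music_df built here is a hypothetical stand-in (its "popularity" and "tempo" columns and all values are invented), since the real dataset is preloaded in the exercise environment.

```python
import pandas as pd

# Hypothetical stand-in for music_df; column names other than "genre"
# and all values are assumptions made for this sketch
music_df = pd.DataFrame({
    "popularity": [60, 45, 72, 38],
    "tempo": [120.0, 95.5, 140.2, 88.0],
    "genre": ["Rock", "Jazz", "Rock", "Pop"],
})

# Create music_dummies: text columns (here only "genre") are encoded,
# and the first binary column of each encoded feature is dropped
music_dummies = pd.get_dummies(music_df, drop_first=True)

# Print the new DataFrame's shape
print("Shape of music_dummies: {}".format(music_dummies.shape))
print(list(music_dummies.columns))
```

When the whole DataFrame is passed, the original "genre" column is replaced by prefixed dummy columns such as genre_Pop, while the numeric columns are left unchanged.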