Creating dummy variables
Including categorical features in the model-building process can improve performance, since they may carry information that contributes to prediction accuracy.
The music_df dataset has been preloaded for you, and its shape is printed. Also, pandas has been imported as pd.
Now you will create a new DataFrame containing the original columns of music_df plus dummy variables from the "genre" column.
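As a minimal sketch of the idea, the snippet below builds a small toy DataFrame and encodes its categorical column with pd.get_dummies. The toy_df data and its "popularity" values are hypothetical stand-ins, not the real music_df. It assumes the categorical column holds strings, which get_dummies encodes by default while leaving numeric columns unchanged.

```python
import pandas as pd

# Hypothetical toy data standing in for music_df (not the real dataset)
toy_df = pd.DataFrame({
    "popularity": [60, 45, 72, 38],
    "genre": ["Rock", "Jazz", "Rock", "Classical"],
})

# Passing the whole DataFrame encodes only the string/categorical columns;
# numeric columns pass through unchanged. drop_first=True drops the column
# for the first category (alphabetically "Classical" here).
toy_dummies = pd.get_dummies(toy_df, drop_first=True)

print(toy_dummies.columns.tolist())
# ['popularity', 'genre_Jazz', 'genre_Rock']
print("Shape of toy_dummies: {}".format(toy_dummies.shape))
# Shape of toy_dummies: (4, 3)
```

Dropping the first binary column loses no information: a row with zeros in every remaining genre column must belong to the dropped category. It also avoids feeding perfectly collinear columns to a linear model.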
This exercise is part of the course
Supervised Learning with scikit-learn
Exercise instructions
- Use a relevant function, passing the entire music_df DataFrame, to create music_dummies, dropping the first binary column.
- Print the shape of music_dummies.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create music_dummies
music_dummies = ____
# Print the new DataFrame's shape
print("Shape of music_dummies: {}".format(____))