Exercise

Encoding categorical variables

There are a couple of columns in the UFO dataset that need to be encoded before they can be modeled with scikit-learn. You'll do that transformation here, using both binary and one-hot encoding methods.

Instructions

  • Using apply(), write a conditional lambda function that returns 1 if the value is "us" and 0 otherwise.
  • Print out the number of .unique() values in the type column.
  • Using pd.get_dummies(), create a one-hot encoded set of the type column.
  • Finally, use pd.concat() to concatenate the type_set encoded variables to the ufo dataset.
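
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the exercise's reference solution: the real UFO data is not shown here, so the small DataFrame is made up, and the column names country and country_enc are assumptions (the instructions do not say which column holds the "us" values). The names ufo, type, and type_set come from the instructions.

```python
import pandas as pd

# Hypothetical stand-in for the UFO dataset; the values are invented
# and "country" is an assumed column name.
ufo = pd.DataFrame({
    "country": ["us", "ca", "us", "gb"],
    "type": ["light", "circle", "light", "triangle"],
})

# Binary encoding: 1 if the value is "us", else 0.
# "country_enc" is an assumed name for the new column.
ufo["country_enc"] = ufo["country"].apply(lambda val: 1 if val == "us" else 0)

# Number of unique values in the type column.
print(len(ufo["type"].unique()))

# One-hot encoding: one indicator column per distinct type value.
type_set = pd.get_dummies(ufo["type"])

# Attach the indicator columns to the original data, column-wise.
ufo = pd.concat([ufo, type_set], axis=1)
```

Passing axis=1 matters in the last step: the default for pd.concat() is axis=0, which would stack type_set underneath ufo as extra rows instead of adding it as new columns.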