Session Ready
Exercise

One-hot encoding

The discipline_logs dataset is loaded in your workspace. These data contain information on student discipline events that occurred during a school day. It contains an assortment of variable types including string variables with various categories. Since most machine learning algorithms cannot interpret this kind of information, we have to encode them as numerical features. One common practice previously discussed is one-hot encoding, in which each row of the column contains zeros, except for the rows that correspond to the specific category, which is set to one.

Instructions
100 XP
  • Load the dplyr library.
  • Create a male column that encodes each row corresponding to male as 1 and set everything else to 0.
  • Create a female column that encodes each row corresponding to female as 1 and set everything else to 0.