Binarizing columns

While numeric values can often be used without any feature engineering, there are cases where some form of manipulation is useful. For example, you might not care about the magnitude of a value, only about its direction or whether it exists at all. In these situations, you will want to binarize the column, that is, reduce it to two values such as 0 and 1. In the so_survey_df data, a large number of survey respondents are working voluntarily (without pay). You will create a new column titled Paid_Job indicating whether each person is paid (their salary is greater than zero).
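As a rough illustration of the idea, binarizing can be sketched as a comparison followed by a cast to integers. The series below is hypothetical toy data, not taken from so_survey_df, and this shortcut is an alternative to the two-step approach the exercise asks for.

```python
import pandas as pd

# Hypothetical toy salaries (assumption: not from so_survey_df)
salaries = pd.Series([0.0, 52000.0, 0.0, 31000.0], name="ConvertedSalary")

# The comparison yields booleans; astype(int) maps False/True to 0/1
is_paid = (salaries > 0).astype(int)

print(is_paid.tolist())  # [0, 1, 0, 1]
```

The magnitude of each salary is discarded; only whether it exceeds zero is kept.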

This exercise is part of the course

Feature Engineering for Machine Learning in Python

Exercise instructions

  • Create a new column called Paid_Job filled with zeros.
  • Replace all the Paid_Job values with a 1 where the corresponding ConvertedSalary is greater than 0.
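The two steps above can be sketched on a small stand-in DataFrame. The name toy_df and its values are assumptions made for illustration; only the column names ConvertedSalary and Paid_Job come from the exercise.

```python
import pandas as pd

# Hypothetical stand-in for so_survey_df, including one missing salary
toy_df = pd.DataFrame({"ConvertedSalary": [0.0, 70000.0, None, 12000.0]})

# Step 1: create the new column filled with zeros
toy_df["Paid_Job"] = 0

# Step 2: use a boolean mask with .loc to set 1 where the salary is positive
toy_df.loc[toy_df["ConvertedSalary"] > 0, "Paid_Job"] = 1

print(toy_df[["Paid_Job", "ConvertedSalary"]])
```

Note that a missing salary compares as not greater than zero, so that row keeps its 0 in this sketch.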

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Create the Paid_Job column filled with zeros
so_survey_df[____] = ____

# Replace all the Paid_Job values where ConvertedSalary is > 0
so_survey_df.____[____, 'Paid_Job'] = 1

# Print the first five rows of the Paid_Job and ConvertedSalary columns
print(so_survey_df[['Paid_Job', 'ConvertedSalary']].head())