Aan de slagGa gratis aan de slag

Binarizing columns

While numeric values can often be used without any feature engineering, there will be cases when some form of manipulation can be useful. For example on some occasions, you might not care about the magnitude of a value but only care about its direction, or if it exists at all. In these situations, you will want to binarize a column. In the so_survey_df data, you have a large number of survey respondents that are working voluntarily (without pay). You will create a new column titled Paid_Job indicating whether each person is paid (their salary is greater than zero).

Deze oefening maakt deel uit van de cursus

Feature Engineering for Machine Learning in Python

Cursus bekijken

Oefeninstructies

  • Create a new column called Paid_Job filled with zeros.
  • Replace all the Paid_Job values with a 1 where the corresponding ConvertedSalary is greater than 0.

Praktische interactieve oefening

Probeer deze oefening eens door deze voorbeeldcode in te vullen.

# Create the Paid_Job column filled with zeros
so_survey_df[____] = ____

# Replace all the Paid_Job values where ConvertedSalary is > 0
so_survey_df.____[____, 'Paid_Job'] = 1

# Print the first five rows of the columns
print(so_survey_df[['Paid_Job', 'ConvertedSalary']].head())
Code bewerken en uitvoeren