
Ordinal encoding of a categorical column

Imputing categorical values takes a few more steps than imputing numerical values. You first need to convert them to numerical values, because statistical operations cannot be performed on strings.

You will use the user profile dataset, which contains customer preferences and choices recorded by a restaurant. It contains only categorical features. In this exercise, you will convert the categorical column 'ambience' to a numerical one using OrdinalEncoder from sklearn. The DataFrame has been loaded for you as users. The OrdinalEncoder class has also been imported.
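To see what OrdinalEncoder does before applying it to users, here is a minimal sketch on a few made-up values. The category names below are hypothetical and are not taken from the exercise dataset.

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical toy values; the encoder expects a 2D array of shape (n_samples, n_features)
values = np.array(["family", "friends", "solitary", "family"]).reshape(-1, 1)

encoder = OrdinalEncoder()
encoded = encoder.fit_transform(values)

# With default settings, the categories found in the column are sorted
# and each one is mapped to its position: 0.0, 1.0, 2.0, ...
categories = encoder.categories_[0]
print(categories)
print(encoded.ravel())
```

The fitted encoder keeps the mapping in categories_, so the codes can later be converted back to the original labels with inverse_transform().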

The head() and tail() of the users DataFrame have been printed for you.

This exercise is part of the course

Dealing with Missing Data in Python


Exercise instructions

  • Create the ordinal encoder object and assign it to ambience_ord_enc.
  • Select the non-missing values of the 'ambience' column in users.
  • Reshape ambience_not_null to shape (-1, 1).
  • Replace the non-missing values of ambience with their encoded values.
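The four steps can be sketched on a small stand-in DataFrame. This is one possible way to carry them out, not necessarily the exercise's expected solution. The name toy_users and its values are hypothetical; the real users DataFrame is provided by the exercise environment.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical stand-in for users, with two missing entries.
# dtype=object lets the column hold both strings and the float codes.
toy_users = pd.DataFrame(
    {"ambience": pd.Series(["family", np.nan, "friends", "family", None], dtype=object)}
)
col_name = "ambience"

# Step 1: create the encoder
toy_enc = OrdinalEncoder()

# Step 2: keep only the non-missing values
col = toy_users[col_name]
mask = col.notnull()
not_null = col[mask]

# Step 3: reshape to a single-column 2D array, as the encoder requires
reshaped = not_null.values.reshape(-1, 1)

# Step 4: encode, then write the codes back only where values were present
encoded = toy_enc.fit_transform(reshaped)
toy_users.loc[mask, col_name] = np.squeeze(encoded)

print(toy_users)
```

The missing entries are left untouched, so they remain available for a later imputation step.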

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Set col_name to 'ambience'
col_name = 'ambience'
# Create Ordinal encoder
ambience_ord_enc = ___

# Select non-null values of ambience column in users
ambience = users[col_name]
ambience_not_null = ___

# Reshape ambience_not_null to shape (-1, 1)
reshaped_vals = ___

# Select the non-null values for the column col_name in users and store the encoded values
encoded_vals = ambience_ord_enc.fit_transform(reshaped_vals)
users.loc[___, col_name] = np.squeeze(encoded_vals)