Ordinal encoding of a DataFrame

Categorical features can be encoded using two techniques: one-hot encoding and ordinal encoding. In one-hot encoding, each category becomes a column; for each row, the column matching its category is 1 and all the others are 0. In ordinal encoding, the categories are mapped to integer values from 0 to the number of categories minus 1.
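
To make the difference concrete, here is a minimal sketch that uses only pandas. The `colors` Series is a hypothetical toy column, not part of the exercise data, and `cat.codes` stands in for an ordinal encoder purely for illustration.

```python
import pandas as pd

# Hypothetical toy column (an assumption, not the exercise data)
colors = pd.Series(["red", "green", "blue", "green"], name="color")

# One-hot encoding: one 0/1 column per category
one_hot = pd.get_dummies(colors, dtype=int)

# Ordinal encoding: each category mapped to an integer code.
# Categories are sorted alphabetically here: blue=0, green=1, red=2
ordinal = colors.astype("category").cat.codes

print(one_hot)
print(ordinal.tolist())
```

With three categories, the one-hot result has three columns, while the ordinal result is a single column holding the codes 0, 1 and 2.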

In this exercise, you will loop over all the columns in the users DataFrame to ordinally encode the categories. You will also store an encoder for each column in a dictionary ordinal_enc_dict so that the encoded columns can be converted back to the original categories.
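
The reason for keeping each fitted encoder is that it remembers the category-to-integer mapping. The sketch below assumes the encoder is scikit-learn's `OrdinalEncoder` (the page does not show the import) and uses a hypothetical `drink` column to show the round trip.

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical column values; the real exercise uses the users DataFrame
values = np.array(["tea", "coffee", "tea", "juice"]).reshape(-1, 1)

# Store the encoder under the column name so it can be reused later
ordinal_enc_dict = {}
ordinal_enc_dict["drink"] = OrdinalEncoder()

# Encode the categories as floats: coffee=0.0, juice=1.0, tea=2.0
encoded = ordinal_enc_dict["drink"].fit_transform(values)

# Recover the original categories from the stored encoder
decoded = ordinal_enc_dict["drink"].inverse_transform(encoded)

print(encoded.ravel())
print(decoded.ravel())
```

Without the stored encoder, the integers alone would not say which category each one stood for.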

This exercise is part of the course

Dealing with Missing Data in Python

Exercise instructions

  • Define an empty dictionary ordinal_enc_dict.
  • Create an Ordinal Encoder object for each column.
  • Select the non-null values of each column in users and encode them.
  • Assign the encoded values back to the non-null positions of each column (col_name) in users.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Create an empty dictionary ordinal_enc_dict
ordinal_enc_dict = ___

for col_name in users:
    # Create Ordinal encoder for col
    ordinal_enc_dict[col_name] = ___
    col = users[col_name]
    
    # Select non-null values of col
    col_not_null = ___
    reshaped_vals = col_not_null.values.reshape(-1, 1)
    encoded_vals = ordinal_enc_dict[col_name].fit_transform(reshaped_vals)
    
    # Select the non-null values for the column col_name in users and store the encoded values
    users.loc[___, ___] = np.squeeze(encoded_vals)
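
For reference, the loop can be tried end to end on a small stand-in DataFrame. This is one possible way to fill in the blanks, not necessarily the course's reference solution. It assumes `OrdinalEncoder` comes from `sklearn.preprocessing`, and the toy `users` data below is hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical stand-in for the users DataFrame, with missing values.
# dtype=object keeps the columns able to hold both strings and floats.
users = pd.DataFrame(
    {
        "drink": ["tea", np.nan, "coffee", "tea"],
        "smokes": ["no", "yes", np.nan, "no"],
    },
    dtype=object,
)

ordinal_enc_dict = {}

for col_name in users:
    # One encoder per column, kept for later inverse transforms
    ordinal_enc_dict[col_name] = OrdinalEncoder()
    col = users[col_name]

    # Encode only the non-null values; the encoder expects a 2D array
    not_null = col.notnull()
    col_not_null = col[not_null]
    reshaped_vals = col_not_null.values.reshape(-1, 1)
    encoded_vals = ordinal_enc_dict[col_name].fit_transform(reshaped_vals)

    # Write the codes back into the non-null positions; NaNs stay NaN
    users.loc[not_null, col_name] = np.squeeze(encoded_vals)

print(users)
```

The missing entries are left untouched, so they remain available for an imputation step afterwards.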