LoslegenKostenlos loslegen

One-hot encoding specific columns

A local used car dealership wants your help in predicting the sale price of their vehicles. If you use one-hot encoding on the entire used_cars dataset, the new dataset has over 1,200 columns. You are worried that this might lead to problems when training your machine learning models to predict price. You have decided to try a simpler approach and only use one-hot encoding on a few columns.

Diese Übung ist Teil des Kurses

Working with Categorical Data in Python

Kurs anzeigen

Anleitung zur Übung

  • Create a new dataset, used_cars_simple, with one-hot encoding for these columns: "manufacturer_name" and "transmission" (in this order).
  • Set the prefix of all new columns to "dummy", so that you can easily filter to newly created columns.

Interaktive Übung

Vervollständige den Beispielcode, um diese Übung erfolgreich abzuschließen.

# Create one-hot encoding for just two columns
used_cars_simple = pd.____(
  used_cars,
  # Specify the columns from the instructions
  ____,
  # Set the prefix
  ____
)

# Print the shape of the new dataset
print(used_cars_simple.shape)
Code bearbeiten und ausführen