
Encoding categorical columns III: DictVectorizer

Alright, one final trick before you dive into pipelines. The two-step process you just went through (LabelEncoder followed by OneHotEncoder) can be simplified by using a DictVectorizer.

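Using a DictVectorizer on a DataFrame that has been converted to a list of dictionaries (one per row) gives you label encoding and one-hot encoding in one go: string-valued features become one-hot columns, while numeric features pass through unchanged.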
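To make the one-step encoding concrete, here is a minimal sketch on a few hand-written records. The feature names `color` and `size` and their values are hypothetical and not part of the exercise data; only `DictVectorizer` itself comes from the text above.

```python
from sklearn.feature_extraction import DictVectorizer

# Hypothetical records (one dict per row), mixing a string and a numeric feature.
records = [
    {"color": "red", "size": 1.0},
    {"color": "blue", "size": 2.0},
    {"color": "red", "size": 3.0},
]

# sparse=False returns a dense NumPy array instead of a SciPy sparse matrix.
dv = DictVectorizer(sparse=False)
encoded = dv.fit_transform(records)

# Each string value gets its own "feature=value" column; "size" stays numeric.
print(dv.feature_names_)
print(encoded)
```

The string feature is expanded into one column per distinct value, so no separate LabelEncoder step is needed before the one-hot encoding.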

Your task is to work through this strategy in this exercise!

This exercise is part of the course Extreme Gradient Boosting with XGBoost.

Exercise instructions

  • Import DictVectorizer from sklearn.feature_extraction.
  • Convert df into a dictionary called df_dict using its .to_dict() method with "records" as the argument.
  • Instantiate a DictVectorizer object called dv with the keyword argument sparse=False.
  • Apply the DictVectorizer on df_dict by using its .fit_transform() method.
  • Hit 'Submit Answer' to print the resulting first five rows and the vocabulary.
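The steps above can be sketched roughly as follows. The small `df` built here is a hypothetical stand-in for the exercise's DataFrame (the real `df` is preloaded in the exercise environment), and the column names `zone` and `area` are assumptions made for illustration only.

```python
import pandas as pd
from sklearn.feature_extraction import DictVectorizer

# Hypothetical stand-in for the exercise's preloaded df.
df = pd.DataFrame({
    "zone": ["A", "B", "A"],
    "area": [8450.0, 9600.0, 11250.0],
})

# Convert df into a list of row dictionaries: df_dict
df_dict = df.to_dict("records")

# Create the DictVectorizer object: dv
dv = DictVectorizer(sparse=False)

# Apply dv on df_dict: df_encoded
df_encoded = dv.fit_transform(df_dict)

# Print the first five rows (only three exist in this toy data)
print(df_encoded[:5, :])

# Print the vocabulary: a mapping from feature name to column index
print(dv.vocabulary_)
```

The vocabulary is useful for reading the encoded array, since the columns are ordered by feature name rather than by their order in the DataFrame.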

Hands-on interactive exercise

Complete this sample code to finish the exercise.

# Import DictVectorizer
____

# Convert df into a dictionary: df_dict
df_dict = ____

# Create the DictVectorizer object: dv
dv = ____

# Apply dv on df_dict: df_encoded
df_encoded = ____

# Print the resulting first five rows
print(df_encoded[:5,:])

# Print the vocabulary
print(dv.vocabulary_)