One-hot encoding transaction data

Throughout the course, we will use a common pipeline for preprocessing data for use in market basket analysis. The first step is to import a pandas DataFrame and select the column that contains transactions. Each transaction in the column will be a string that consists of a number of items, each separated by a comma. The next step is to use a lambda function to split each transaction string into a list, thereby transforming the column into a list of lists.
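As a rough illustration of the preprocessing steps described above, the following sketch builds a tiny DataFrame and splits each transaction string into a list. The toy data and the column name `Transaction` are assumptions made for this example and are not taken from the course's grocery dataset.

```python
import pandas as pd

# Hypothetical toy data; the column name "Transaction" is an assumption.
groceries = pd.DataFrame({
    "Transaction": ["milk,bread,biscuit", "bread,milk", "tea,biscuit"]
})

# Split each comma-separated string into a list of items,
# then convert the resulting column into a plain list of lists.
transactions = groceries["Transaction"].apply(lambda t: t.split(",")).tolist()

print(transactions)
```

If the real data contains spaces after the commas, stripping whitespace from each item would probably be needed as well.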

In this exercise, you'll start with the list of lists from the grocery dataset, which is available to you as transactions. You will then transform transactions into a one-hot encoded DataFrame, where each row represents a transaction, each column represents an item, and each value is True or False depending on whether that item was included in that transaction.
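To make the target format concrete, here is a minimal sketch that one-hot encodes a small list of lists by hand, without mlxtend. The toy baskets and the names `items`, `rows`, and `onehot_manual` are hypothetical and only meant to show what the resulting table looks like.

```python
import pandas as pd

# Hypothetical toy transactions (not the grocery dataset).
toy_baskets = [["milk", "bread"], ["bread", "tea"], ["milk"]]

# Sorted list of the unique items across all transactions.
items = sorted({item for basket in toy_baskets for item in basket})

# One row per transaction, one Boolean value per item.
rows = [[item in basket for item in items] for basket in toy_baskets]

onehot_manual = pd.DataFrame(rows, columns=items)
print(onehot_manual)
```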

This exercise is part of the course

Market Basket Analysis in Python

Exercise instructions

  • From mlxtend.preprocessing, import TransactionEncoder.
  • Instantiate a transaction encoder and identify the unique items in transactions.
  • One-hot encode transactions as an array and assign the result to onehot.
  • Convert the array into a pandas DataFrame using the item names as column headers.

Interactive exercise

Try this exercise by completing the sample code below.

# Import the transaction encoder class from mlxtend
from ____.____ import ____
import pandas as pd

# Instantiate transaction encoder and identify unique items in transactions
encoder = TransactionEncoder().____(____)

# One-hot encode transactions
onehot = encoder.____(transactions)

# Convert one-hot encoded data to DataFrame
onehot = pd.DataFrame(____, columns = encoder.columns_)

# Print the one-hot encoded transaction dataset
print(onehot)