
One-hot encoding transaction data

Throughout the course, we will use a common pipeline to preprocess data for market basket analysis. The first step is to load the data into a pandas DataFrame and select the column that contains transactions. Each transaction in that column is a string of items separated by commas. The next step is to use a lambda function to split each transaction string into a list of items, which turns the column into a list of lists.
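As a minimal sketch of that pipeline, the snippet below builds a small DataFrame by hand instead of loading the course data. The column name Transaction and the three sample baskets are made up for illustration and are not taken from the grocery dataset.

```python
import pandas as pd

# Hypothetical stand-in for the course data: one comma-separated string per transaction.
df = pd.DataFrame({"Transaction": ["milk,bread,biscuit", "bread,milk", "tea,biscuit"]})

# Select the transaction column and split each string into a list of items.
transactions = df["Transaction"].apply(lambda t: t.split(",")).tolist()

# transactions is now a list of lists, one inner list per basket.
print(transactions)
```

If the real strings contain spaces after the commas, each item would also need to be stripped before further processing.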

In this exercise, you'll start with the list of lists from the grocery dataset, which is available to you as transactions. You will then transform transactions into a one-hot encoded DataFrame with one column per item, where each value is True or False depending on whether that item was included in the transaction.
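To make the target format concrete, here is a small sketch that one-hot encodes a toy list of lists by hand, using plain Python only. The baskets are invented for illustration, and this is not how the exercise expects you to solve the problem; it only shows what the True and False values mean.

```python
# Toy transactions (made up for illustration, not the grocery dataset).
toy_transactions = [["milk", "bread"], ["bread", "jam"], ["milk"]]

# Collect the unique items in sorted order; these play the role of column names.
items = sorted({item for basket in toy_transactions for item in basket})

# One row per transaction, one True/False entry per item.
rows = [[item in basket for item in items] for basket in toy_transactions]

print(items)
for row in rows:
    print(row)
```

Each row has the same length as the list of unique items, so the rows line up as the columns of a table.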

This is a part of the course

“Market Basket Analysis in Python”


Exercise instructions

  • From mlxtend.preprocessing, import TransactionEncoder.
  • Instantiate a transaction encoder and identify the unique items in transactions.
  • One-hot encode transactions as an array and assign the result to onehot.
  • Convert the array into a pandas DataFrame using the item names as column headers.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Import the TransactionEncoder class from mlxtend
from ____.____ import ____
import pandas as pd

# Instantiate transaction encoder and identify unique items in transactions
encoder = TransactionEncoder().____(____)

# One-hot encode transactions
onehot = encoder.____(transactions)

# Convert one-hot encoded data to DataFrame
onehot = pd.DataFrame(____, columns = encoder.columns_)

# Print the one-hot encoded transaction dataset
print(onehot)


This course explores association rules in market basket analysis with Python by working with bookstore data and creating movie recommendations.

In this chapter, you’ll learn the basics of Market Basket Analysis: association rules, metrics, and pruning. You’ll then apply these concepts to help a small grocery store improve its promotional and product placement efforts.

