
Preparing data for market basket analysis

Throughout this course, you will typically encounter data in one of two formats: a pandas DataFrame or a list of lists. DataFrame objects will be constructed by importing a CSV file using pandas. Each DataFrame will consist of a single column, where each element is a string containing the items in one transaction, separated by commas, as in the table below.

In this exercise, you will practice loading the data from a CSV file and preparing it for use as a list of lists. Note that the path to the grocery store dataset has already been defined and is available to you as groceries_path.

Transaction
'milk,bread,biscuit'
'bread,milk,biscuit,cereal'
'tea,milk,coffee,cereal'
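
To make the two formats concrete, here is a minimal sketch that builds the table above as a one-column DataFrame and converts it to a list of lists. The DataFrame is constructed in memory rather than loaded from the course dataset, and the variable names df and as_lists are illustrative. It uses the vectorized Series.str.split, which is one of several equivalent ways to split the strings.

```python
import pandas as pd

# Format 1: a DataFrame with a single column of comma-separated strings
# (rows copied from the table above; built in memory for illustration).
df = pd.DataFrame({
    "Transaction": [
        "milk,bread,biscuit",
        "bread,milk,biscuit,cereal",
        "tea,milk,coffee,cereal",
    ]
})

# Format 2: a list of lists, one inner list of items per transaction.
as_lists = df["Transaction"].str.split(",").tolist()

print(as_lists)
```

Each inner list holds the items of one transaction, in the order they appeared in the original string.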

This exercise is part of the course

Market Basket Analysis in Python


Exercise instructions

  • Import the pandas package under the alias pd.
  • Use pandas to read the CSV file at the path specified by groceries_path.
  • Select the Transaction column from the DataFrame and split each string of comma-separated items into a list.
  • Convert the resulting column (a pandas Series) of transactions into a list of lists.
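
The steps above can be sketched end to end on a small stand-in for the dataset. This is an illustration under assumptions, not the course's data: csv_text is a hypothetical three-row file held in memory and read through io.StringIO in place of groceries_path. Note that because the items themselves contain commas, each transaction string must be quoted in the CSV so that it is parsed as a single field.

```python
import io

import pandas as pd

# Hypothetical stand-in for the file at groceries_path: one quoted
# field per row, so the inner commas are not treated as delimiters.
csv_text = (
    "Transaction\n"
    '"milk,bread,biscuit"\n'
    '"bread,milk,biscuit,cereal"\n'
    '"tea,milk,coffee,cereal"\n'
)

# Read the CSV data into a one-column DataFrame.
groceries = pd.read_csv(io.StringIO(csv_text))

# Split each transaction string into a list of items (yields a Series of lists).
split_series = groceries["Transaction"].apply(lambda t: t.split(","))

# Convert the Series into a plain Python list of lists.
transactions = list(split_series)

print(transactions)
```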

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Import pandas under the alias pd
import ____ as pd

# Load transactions from the CSV file using pandas
groceries = pd.____(groceries_path)

# Split transaction strings into lists
transactions = groceries['____'].apply(lambda t: t.split(','))

# Convert DataFrame column into list of lists
transactions = list(____)

# Print the list of transactions
print(transactions)