Performing aggregation
After completing minor consulting jobs for a library and an ebook seller, you've finally received your first big market basket analysis project: advising an online novelty gifts retailer on cross-promotions. Since the retailer has never previously hired a data scientist, it would like you to start the project by exploring its transaction data. It has asked you to perform aggregation for all signs in the dataset and also compute the support for this category. Note that pandas has been imported for you as `pd`. Additionally, the data has been imported in one-hot encoded format as `onehot`.
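For readers unfamiliar with the one-hot encoded format mentioned above, here is a minimal sketch of how such a table could be built from raw baskets. The transactions, item names, and the name `toy_onehot` are invented for illustration and are not the retailer's data; the exercise's own `onehot` DataFrame is already prepared for you.

```python
import pandas as pd

# Hypothetical transactions; the item names are made up for illustration.
transactions = [
    ["WOODEN SIGN", "MUG"],
    ["MUG"],
    ["METAL SIGN", "WOODEN SIGN"],
    ["CANDLE"],
]

# Collect the distinct items so each one becomes a column.
items = sorted({item for basket in transactions for item in basket})

# One row per transaction, one boolean column per item:
# True if the item appears in that transaction, False otherwise.
toy_onehot = pd.DataFrame(
    [[item in basket for item in items] for basket in transactions],
    columns=items,
)

print(toy_onehot)
```

In this layout, the mean of a boolean column is the share of transactions containing that item, which is the item's support.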
This exercise is part of the course
Market Basket Analysis in Python
Exercise instructions
- Select the subset of the DataFrame's columns that contain the string `sign`.
- Print the support for `signs`.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Select the column headers for sign items
sign_headers = [i for i in onehot.columns if i.lower().find('sign')>=0]
# Select columns of sign items using sign_headers
sign_columns = onehot[____]
# Perform aggregation of sign items into sign category
signs = sign_columns.sum(axis = 1) >= 1.0
# Print support for signs
print('Share of Signs: %.2f' % ____.mean())
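To see what the aggregation step computes, the sketch below runs the same logic on a tiny hypothetical DataFrame. The name `toy` and its contents are assumptions made for illustration, not the exercise's `onehot` data. A transaction counts toward the sign category if it contains at least one sign item, and the support is the share of such transactions.

```python
import pandas as pd

# Hypothetical one-hot data: four transactions, four items (invented names).
toy = pd.DataFrame({
    "WOODEN SIGN": [True, False, True, False],
    "METAL SIGN": [False, False, True, False],
    "MUG": [True, True, False, False],
    "CANDLE": [False, False, False, True],
})

# Column headers whose name contains "sign", ignoring case.
sign_headers = [c for c in toy.columns if "sign" in c.lower()]

# Keep only the sign item columns.
sign_columns = toy[sign_headers]

# A transaction belongs to the category if it has at least one sign item.
signs = sign_columns.sum(axis=1) >= 1

# Support of the category: the fraction of transactions that are True.
support = signs.mean()
print("Share of Signs: %.2f" % support)
```

Summing across columns and then thresholding means a basket with two different signs is counted once, so the category's support can never exceed 1.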