
Defining an aggregation function

Surprised by the high share of sign items in its inventory, the retailer decides to aggregate other categories as well to explore the data further. This may seem trivial to you, but the retailer has not previously been able to perform even a basic descriptive analysis of its transactions and items.

The retailer asks you to perform aggregation for the candles, bags, and boxes categories. To simplify the task, you decide to write a function. It will take a string that identifies an item's category and return a boolean pandas Series indicating whether each transaction includes at least one item from that category. Note that pandas has been imported for you as pd. Additionally, the data has been imported in one-hot encoded format as onehot.
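To make the idea concrete, here is a minimal sketch of this kind of aggregation on a small, made-up one-hot table. The item names and transactions are hypothetical and are not the retailer's data. Unlike the exercise scaffold, this variant takes the table as an explicit argument so that the block is self-contained.

```python
import pandas as pd

# Hypothetical toy data: each row is a transaction, each column an item.
onehot = pd.DataFrame(
    {
        "RED RETROSPOT BAG": [True, False, False],
        "JUMBO BAG PINK": [False, False, True],
        "HEART GIFT BOX": [False, True, True],
        "VANILLA CANDLE": [False, False, False],
    }
)

def aggregate(onehot, item):
    # Keep the column headers whose lowercase name contains the category string.
    item_headers = [i for i in onehot.columns if i.lower().find(item) >= 0]

    # Restrict the table to the matching item columns.
    item_columns = onehot[item_headers]

    # A transaction belongs to the category if it contains at least one matching item.
    return item_columns.sum(axis=1) >= 1

bags = aggregate(onehot, "bag")
boxes = aggregate(onehot, "box")
candles = aggregate(onehot, "candle")

print(bags.tolist())     # [True, False, True]
print(boxes.tolist())    # [False, True, True]
print(candles.tolist())  # [False, False, False]
```

Each result has one entry per transaction, so the three results could be combined into a smaller category-level table if needed.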

This exercise is part of the course

Market Basket Analysis in Python


Instructions

  • Complete the list comprehension that extracts a subset of the column headers.
  • Select the columns for the item you wish to aggregate.
  • Call aggregate() with the strings bag, box, and candle to aggregate the bags, boxes, and candles categories.
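The header-matching step in the first instruction can be tried in isolation. The sketch below uses a hypothetical list of headers and a helper named matching_headers, neither of which comes from the exercise. It also shows one caveat of substring matching: a short category string can match unrelated item names.

```python
# Hypothetical column headers, chosen only for illustration.
headers = ["RED RETROSPOT BAG", "HEART GIFT BOX", "Cabbage Sign", "VANILLA CANDLE"]

def matching_headers(headers, item):
    # str.find returns -1 when the substring is absent, so >= 0 means "contains".
    return [h for h in headers if h.lower().find(item) >= 0]

bag_headers = matching_headers(headers, "bag")
box_headers = matching_headers(headers, "box")

print(bag_headers)  # ['RED RETROSPOT BAG', 'Cabbage Sign']
print(box_headers)  # ['HEART GIFT BOX']
```

Because "cabbage" contains "bag", the second header is picked up as well, so it can be worth inspecting the matched headers before aggregating.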

Hands-on interactive exercise

Try this exercise by completing this sample code.

def aggregate(item):
	# Select the column headers in onehot that match the item category
	item_headers = [i for i in ____.columns if i.lower().find(item)>=0]

	# Select the columns for the matching items
	item_columns = onehot[____]

	# Return category of aggregated items
	return item_columns.sum(axis = 1) >= 1.0

# Aggregate items for the bags, boxes, and candles categories  
bags = aggregate('bag')
boxes = aggregate('____')
candles = ____