Get startedGet started for free

Remapping categories

To better understand survey respondents from airlines, you want to find out if there is a relationship between certain responses and the day of the week and wait time at the gate.

The airlines DataFrame contains the day and wait_min columns, which are categorical and numerical respectively. The day column contains the exact day a flight took place, and wait_min contains the amount of minutes it took travelers to wait at the gate. To make your analysis easier, you want to create two new categorical variables:

  • wait_type: 'short' for 0-60 min, 'medium' for 60-180 and long for 180+
  • day_week: 'weekday' if day is in the weekday, 'weekend' if day is in the weekend.

The pandas and numpy packages have been imported as pd and np. Let's create some new categorical data!

This exercise is part of the course

Cleaning Data in Python

View Course

Exercise instructions

  • Create the ranges and labels for the wait_type column mentioned in the description.
  • Create the wait_type column by from wait_min by using pd.cut(), while inputting label_ranges and label_names in the correct arguments.
  • Create the mapping dictionary mapping weekdays to 'weekday' and weekend days to 'weekend'.
  • Create the day_week column by using .replace().

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Create ranges for categories
label_ranges = [0, 60, ____, np.inf]
label_names = ['short', ____, ____]

# Create wait_type column
airlines['wait_type'] = pd.____(____, bins = ____, 
                                labels = ____)

# Create mappings and replace
mappings = {'Monday':'weekday', 'Tuesday':'____', 'Wednesday': '____', 
            'Thursday': '____', '____': '____', 
            'Saturday': 'weekend', '____': '____'}

airlines['day_week'] = airlines['day'].____(mappings)
Edit and Run Code