Remapping categories
To better understand survey respondents from airlines, you want to find out if there is a relationship between certain responses and the day of the week and wait time at the gate.
The airlines DataFrame contains the day and wait_min columns, which are categorical and numerical respectively. The day column contains the exact day a flight took place, and wait_min contains the number of minutes travelers waited at the gate. To make your analysis easier, you want to create two new categorical variables:
- wait_type: 'short' for 0-60 min, 'medium' for 60-180 min and 'long' for 180+ min.
- day_week: 'weekday' if day is a weekday, 'weekend' if day falls on the weekend.
The pandas and numpy packages have been imported as pd and np. Let's create some new categorical data!
This exercise is part of the course
Cleaning Data in Python
Exercise instructions
- Create the ranges and labels for the wait_type column mentioned in the description.
- Create the wait_type column from wait_min by using pd.cut(), passing label_ranges and label_names to the correct arguments.
- Create the mappings dictionary mapping weekdays to 'weekday' and weekend days to 'weekend'.
- Create the day_week column by using .replace().
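As a rough illustration of how pd.cut() treats bin edges, here is a minimal sketch on made-up data. The toy DataFrame and the names bin_edges and bin_labels are assumptions for illustration only, not part of the exercise. By default each bin is closed on the right, so with edges [0, 60, 180, np.inf] the bins are (0, 60], (60, 180] and (180, inf], and a wait of exactly 0 minutes falls outside every bin unless include_lowest=True is passed.

```python
import numpy as np
import pandas as pd

# Toy data for illustration only (not the course's airlines DataFrame)
toy = pd.DataFrame({"wait_min": [0, 15, 60, 61, 180, 400]})

# Hypothetical names: n + 1 edges define n bins, so 4 edges pair with 3 labels
bin_edges = [0, 60, 180, np.inf]
bin_labels = ["short", "medium", "long"]

# Default behaviour (right=True): bins are (0, 60], (60, 180], (180, inf]
toy["wait_type"] = pd.cut(toy["wait_min"], bins=bin_edges, labels=bin_labels)

# include_lowest=True also places the left-most edge (0) in the first bin
toy["wait_type_incl"] = pd.cut(
    toy["wait_min"], bins=bin_edges, labels=bin_labels, include_lowest=True
)

print(toy)
```

In this sketch the row with wait_min equal to 0 is missing in wait_type but labelled 'short' in wait_type_incl, which is worth checking for if the real data can contain zero-minute waits.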
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create ranges for categories
label_ranges = [0, 60, ____, np.inf]
label_names = ['short', ____, ____]
# Create wait_type column
airlines['wait_type'] = pd.____(____, bins = ____,
                                labels = ____)
# Create mappings and replace
mappings = {'Monday': 'weekday', 'Tuesday': '____', 'Wednesday': '____',
            'Thursday': '____', '____': '____',
            'Saturday': 'weekend', '____': '____'}
airlines['day_week'] = airlines['day'].____(mappings)
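
The remapping step can be sketched on made-up data as follows. The toy DataFrame and the names weekdays, weekend_days and day_mapping are assumptions for illustration, and building the dictionary with comprehensions is just one possible alternative to typing every key by hand.

```python
import pandas as pd

# Toy data for illustration only (not the course's airlines DataFrame)
toy = pd.DataFrame({"day": ["Monday", "Saturday", "Friday", "Sunday"]})

# Hypothetical helper lists used to build the mapping dictionary
weekdays = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
weekend_days = ["Saturday", "Sunday"]

day_mapping = {d: "weekday" for d in weekdays}
day_mapping.update({d: "weekend" for d in weekend_days})

# .replace() swaps each key found in the column for its mapped value
toy["day_week"] = toy["day"].replace(day_mapping)

print(toy["day_week"].tolist())
# ['weekday', 'weekend', 'weekday', 'weekend']
```

Values that do not appear as keys in the dictionary are left unchanged by .replace(), so a misspelled day name would pass through silently; Series.map() with the same dictionary would turn such values into NaN instead.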