Get startedGet started for free

Date features

You've built some basic features using numerical variables. Now, it's time to create features based on date and time. You will practice on a subsample from the Taxi Fare Prediction Kaggle competition data. The data represents information about the taxi rides and the goal is to predict the price for each ride.

Your objective is to generate date features from the pickup datetime. Recall that it's better to create new features for train and test data simultaneously. After the features are created, split the data back into the train and test DataFrames. Here it's done using pandas' isin() method.

The train and test DataFrames are already available in your workspace.

This exercise is part of the course

Winning a Kaggle Competition in Python

View Course

Exercise instructions

  • Concatenate the train and test DataFrames into a single DataFrame taxi.
  • Convert the "pickup_datetime" column to a datetime object.
  • Create the day of week (using .dayofweek attribute) and hour (using .hour attribute) features from the "pickup_datetime" column.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Concatenate train and test together
taxi = ____.____([train, test])

# Convert pickup date to datetime object
taxi['pickup_datetime'] = ____.____(taxi['pickup_datetime'])

# Create a day of week feature
taxi['dayofweek'] = taxi['pickup_datetime'].dt.____

# Create an hour feature
taxi['hour'] = taxi['pickup_datetime'].dt.____

# Split back into train and test
new_train = taxi[taxi['id'].isin(train['id'])]
new_test = taxi[taxi['id'].isin(test['id'])]
Edit and Run Code