Create day-of-week features

We can engineer datetime features to give our non-linear models more information. Most financial data is timestamped, and a timestamp carries a lot of information: year, month, day, and sometimes hour, minute, and second. We can also derive the day of the week, the quarter of the year, or the time elapsed since some event (e.g. an earnings report).
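
As a rough illustration of the features mentioned above, the sketch below derives the year, month, quarter, and days elapsed since an event from a datetime index. The toy DataFrame, its `close` column, and the `event_date` value are hypothetical and not part of the exercise data.

```python
import pandas as pd

# Hypothetical toy data: five consecutive business days of prices
idx = pd.date_range("2018-03-28", periods=5, freq="B")
df = pd.DataFrame({"close": [10.0, 10.2, 10.1, 10.4, 10.3]}, index=idx)

# Calendar components are available as properties of the DatetimeIndex
df["year"] = df.index.year
df["month"] = df.index.month
df["quarter"] = df.index.quarter

# Elapsed time since a (hypothetical) event, in whole days
event_date = pd.Timestamp("2018-03-27")
df["days_since_event"] = (df.index - event_date).days

print(df)
```

Subtracting a timestamp from the index yields timedeltas, and `.days` converts them to integers that a model can use directly.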

We will only extract the day of the week here, since our dataset doesn't go back very far in time. The dayofweek property of the pandas datetime index gives the day of the week as an integer (Monday is 0). We then one-hot encode these values with pandas' get_dummies(), which creates one column per day of the week holding binary values (0 or 1). We drop the first column because it can be inferred from the others: a row with zeros in every remaining column must belong to the dropped day.
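
To see what dropping the first column does, here is a minimal sketch on a hypothetical one-week index (not the exercise's lng_df). It compares the full set of dummy columns with the reduced set produced by `drop_first=True`.

```python
import pandas as pd

# Hypothetical index covering one trading week, Monday to Friday
idx = pd.date_range("2018-04-02", periods=5, freq="B")

days = idx.dayofweek  # integers 0 (Monday) through 4 (Friday)

# One column per weekday present in the data
full = pd.get_dummies(days, prefix="weekday")

# Same encoding without the first column (weekday_0)
reduced = pd.get_dummies(days, prefix="weekday", drop_first=True)

print(list(full.columns))
print(list(reduced.columns))

# Monday has no column of its own in `reduced`; its row is all zeros
print(reduced.iloc[0])
```

Dropping one level avoids carrying a column that is a fixed combination of the others, while still letting a model tell all five days apart.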

This exercise is part of the course

Machine Learning for Finance in Python

Exercise instructions

  • Use the dayofweek property from the lng_df index to get the days of the week.
  • Use the get_dummies function on the days of the week variable, giving it a prefix of 'weekday'.
  • Set the index of the days_of_week variable to be the same as the lng_df index so we can merge the two.
  • Concatenate the lng_df and days_of_week DataFrames into one DataFrame.
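
The index step matters because `pd.concat` with `axis=1` lines rows up by index label. The sketch below uses a hypothetical `prices` DataFrame standing in for lng_df, and shows that the dummies start out with a plain integer index and must be given the datetime index before concatenating.

```python
import pandas as pd

# Hypothetical stand-in for lng_df: three business days of prices
idx = pd.date_range("2018-04-02", periods=3, freq="B")
prices = pd.DataFrame({"close": [10.0, 10.2, 10.1]}, index=idx)

# Only Monday to Wednesday appear here, so after drop_first the
# remaining columns are weekday_1 and weekday_2
dummies = pd.get_dummies(idx.dayofweek, prefix="weekday", drop_first=True)

# get_dummies on an index returns rows labeled 0, 1, 2, not dates
print(dummies.index)

# Give the dummies the same labels as the price data, then join column-wise
dummies.index = prices.index
aligned = pd.concat([prices, dummies], axis=1)

print(aligned)
```

With matching labels, each date keeps its price alongside its weekday indicators, and no missing values are introduced by the join.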

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Use pandas' get_dummies function to get dummies for day of the week
days_of_week = pd.get_dummies(lng_df.index.____,
                              prefix=____,
                              drop_first=True)

# Set the index as the original dataframe index for merging
days_of_week.index = lng_df.____

# Join the dataframe with the days of week dataframe
lng_df = pd.concat([lng_df, ____], axis=1)

# Add days of week to feature names
# (weekday_1 through weekday_4; the first weekday column was dropped by drop_first=True)
feature_names.extend(['weekday_' + str(i) for i in range(1, 5)])
lng_df.dropna(inplace=True)  # drop missing values in-place
print(lng_df.head())