Create day-of-week features
We can engineer datetime features to add even more information for our non-linear models. Most financial data has datetimes, which have lots of information in them -- year, month, day, and sometimes hour, minute, and second. But we can also get the day of the week, and things like the quarter of the year, or the elapsed time since some event (e.g. earnings reports).
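As a rough illustration of the other datetime features mentioned above, the sketch below pulls the year, month, and quarter from a pandas datetime index and computes the days elapsed since an event. The dates and the event date are made up for the example and are not from the course dataset.

```python
import pandas as pd

# Hypothetical index: four month-start dates (not the course data)
idx = pd.date_range("2018-01-01", periods=4, freq="MS")

# Hypothetical event date, e.g. an earnings report
event = pd.Timestamp("2017-12-15")

features = pd.DataFrame(
    {
        "year": idx.year,
        "month": idx.month,
        "quarter": idx.quarter,
        # Subtracting a Timestamp gives a TimedeltaIndex; .days extracts whole days
        "days_since_event": (idx - event).days,
    },
    index=idx,
)
print(features)
```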
We are only going to get the day of the week here, since our dataset doesn't go back very far in time. The dayofweek property of the pandas datetime index gives us the day of the week as an integer (Monday is 0). Then we will create dummy variables from dayofweek with pandas' get_dummies(). This creates one column for each day of the week with binary values (0 or 1). We drop the first column because it can be inferred from the others: a row with zeros in every remaining column must belong to the dropped day.
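To make the encoding concrete, here is a minimal sketch on a made-up index covering one business week. The dates are hypothetical. The dtype=int argument is an addition for this example so the output stays 0/1, since newer pandas versions return True/False by default.

```python
import pandas as pd

# Hypothetical index: Monday 2018-01-01 through Friday 2018-01-05
idx = pd.date_range("2018-01-01", periods=5, freq="B")

# dayofweek encodes Monday as 0 and Friday as 4
days = idx.dayofweek
print(list(days))

# drop_first=True removes the weekday_0 (Monday) column;
# dtype=int keeps the values as 0/1 across pandas versions
dummies = pd.get_dummies(days, prefix="weekday", drop_first=True, dtype=int)
print(dummies)
```

The Monday row is all zeros, which is how the dropped column can still be inferred from the remaining four.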
This exercise is part of the course
Machine Learning for Finance in Python
Exercise instructions
- Use the dayofweek property from the lng_df index to get the days of the week.
- Use the get_dummies function on the days of the week variable, giving it a prefix of 'weekday'.
- Set the index of the days_of_week variable to be the same as the lng_df index so we can merge the two.
- Concatenate the lng_df and days_of_week DataFrames into one DataFrame.
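The index step matters because pd.concat with axis=1 lines rows up by index label, and get_dummies called on a bare index returns a frame with a default integer index. The sketch below shows this on a small made-up price frame. The name prices and its values are hypothetical stand-ins, not the course's lng_df.

```python
import pandas as pd

# Hypothetical stand-in for a price DataFrame with a datetime index
dates = pd.date_range("2018-01-01", periods=5, freq="B")
prices = pd.DataFrame({"close": [10.0, 10.5, 10.2, 10.8, 11.0]}, index=dates)

days_of_week = pd.get_dummies(prices.index.dayofweek,
                              prefix="weekday",
                              drop_first=True,
                              dtype=int)

# Before alignment the dummies carry a default 0..n-1 index, not the dates
aligned_before = days_of_week.index.equals(prices.index)
print(aligned_before)

# Reuse the datetime index so rows match by label when concatenating
days_of_week.index = prices.index
combined = pd.concat([prices, days_of_week], axis=1)
print(combined)
```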
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Use pandas' get_dummies function to get dummies for day of the week
days_of_week = pd.get_dummies(lng_df.index.____,
                              prefix=____,
                              drop_first=True)
# Set the index as the original dataframe index for merging
days_of_week.index = lng_df.____
# Join the dataframe with the days of week dataframe
lng_df = pd.concat([lng_df, ____], axis=1)
# Add days of week to feature names
feature_names.extend(['weekday_' + str(i) for i in range(1, 5)])
lng_df.dropna(inplace=True) # drop missing values in-place
print(lng_df.head())