
Create features and targets

Our features and targets are almost machine-learning ready. We have features from current price changes (5d_close_pct) and indicators (moving averages and RSI), and we created targets of future price changes (5d_close_future_pct). Now we need to split these into separate feature and target objects so we can feed them into machine learning algorithms. In this exercise they remain pandas objects, which convert readily to numpy arrays.

Our indicators also leave missing values at the beginning of the DataFrame, because a rolling calculation has no result until enough rows are available. We could backfill this data, fill it with a single value, or drop the rows. Dropping the rows is a good choice, because it keeps backfilled or 0-filled values, which do not reflect real observations, out of the data our machine learning algorithms learn from. Pandas has a .dropna() method, which we will use to drop any rows with missing values.
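To make the three options concrete, here is a minimal sketch on a tiny made-up price series. The DataFrame `prices`, the column names `close` and `ma3`, and the 3-row window are all assumptions for illustration and are not part of the course data.

```python
import pandas as pd

# Hypothetical toy data, not the course's lng_df
prices = pd.DataFrame({"close": [10.0, 10.5, 10.2, 10.8, 11.0, 10.9]})

# A 3-row moving average has no value for the first two rows
prices["ma3"] = prices["close"].rolling(3).mean()

backfilled = prices.bfill()      # copy the first valid value backwards
zero_filled = prices.fillna(0)   # replace missing values with a constant
dropped = prices.dropna()        # remove every row that has a missing value

print(prices["ma3"].isna().sum())  # number of missing moving-average rows
print(len(prices), len(dropped))   # row counts before and after dropping
```

Backfilling and zero-filling both keep all six rows but put invented numbers in the first two, while dropping keeps only rows where every column was actually computed.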

This exercise is part of the course

Machine Learning for Finance in Python

Exercise instructions

  • Drop the missing values from lng_df with .dropna() from pandas.
  • Create a variable containing our targets, which are the '5d_close_future_pct' values.
  • Create a DataFrame containing both targets (5d_close_future_pct) and features (contained in the existing list feature_names) so we can check the correlations.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Drop all na values
lng_df = lng_df.____

# Create features and targets
# use feature_names for features; '5d_close_future_pct' for targets
features = lng_df[feature_names]
targets = lng_df[____]

# Create DataFrame from target column and feature columns
feature_and_target_cols = ['5d_close_future_pct'] + ____
feat_targ_df = lng_df[feature_and_target_cols]

# Calculate correlation matrix
corr = feat_targ_df.corr()
print(corr)
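For reference, the same sequence of steps can be sketched on synthetic data. Everything named `toy_*` below, the random-walk prices, and the `ma14` feature are assumptions made up for this illustration; the real lng_df and feature_names come from the course environment and may be built differently.

```python
import numpy as np
import pandas as pd

# Hypothetical random-walk closing prices (60 rows), for illustration only
rng = np.random.default_rng(0)
toy_df = pd.DataFrame({"close": 100 + rng.normal(0, 1, 60).cumsum()})

# Feature: 5-day percent change; target: the same quantity 5 days ahead
toy_df["5d_close_pct"] = toy_df["close"].pct_change(5)
toy_df["5d_close_future_pct"] = toy_df["5d_close_pct"].shift(-5)

# Hypothetical indicator: 14-day moving average scaled by the close
toy_df["ma14"] = toy_df["close"].rolling(14).mean() / toy_df["close"]
toy_feature_names = ["5d_close_pct", "ma14"]

# Drop rows where any feature or target is missing
toy_df = toy_df.dropna()

# Separate features and targets
toy_features = toy_df[toy_feature_names]
toy_targets = toy_df["5d_close_future_pct"]

# Correlation matrix of the target and feature columns
toy_cols = ["5d_close_future_pct"] + toy_feature_names
toy_corr = toy_df[toy_cols].corr()
print(toy_corr)

# Convert to numpy arrays if an algorithm needs them
X = toy_features.to_numpy()
y = toy_targets.to_numpy()
print(X.shape, y.shape)
```

The first row of the correlation matrix shows how each feature correlates with the target. On random-walk data like this, those values carry no real signal.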