Create features and targets

Our features and targets are almost machine-learning ready. The features come from current price changes (5d_close_pct) and indicators (moving averages and RSI), and the targets are future price changes (5d_close_future_pct). Now we need to split them into separate NumPy arrays so we can feed them into machine learning algorithms.

Because the indicators are computed over a trailing window, the first rows of the DataFrame have missing values. We could backfill this data, fill it with a single value, or drop the rows. Dropping the rows is a good choice here, because the machine learning algorithms then never see backfilled or 0-filled values that are not real observations. pandas DataFrames have a .dropna() method, which we will use to drop any rows with missing values.
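To see why the missing values appear, here is a minimal sketch on toy data. The `prices` DataFrame, the `Adj_Close` column name, and the 3-day window are assumptions made for illustration; they are not the exercise's actual data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy prices (the exercise itself works with lng_df).
prices = pd.DataFrame({"Adj_Close": np.arange(1.0, 11.0)})

# A 3-day moving average needs 3 observations, so the first
# 2 rows have no value and are filled with NaN.
prices["ma3"] = prices["Adj_Close"].rolling(window=3).mean()

# Count the missing values introduced by the indicator.
n_missing = int(prices["ma3"].isna().sum())

# Drop every row that contains at least one missing value.
clean = prices.dropna()

print(n_missing, len(prices), len(clean))
```

A longer window leaves more leading rows empty, so the number of rows dropped depends on the longest indicator window in use.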

Drop the missing values from lng_df with .dropna() from pandas.
Create a variable containing our targets, which are the '5d_close_future_pct' values.
Create a DataFrame containing both targets (5d_close_future_pct) and features (contained in the existing list feature_names) so we can check the correlations.
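The three steps above can be sketched roughly as follows. The synthetic price series, the `Adj_Close` and `ma14` column names, and the contents of `feature_names` are assumptions standing in for the exercise's preloaded `lng_df` and `feature_names`; only the last part of the block corresponds to the instructions themselves.

```python
import numpy as np
import pandas as pd

# --- Hypothetical stand-in for the preloaded data (assumption) ---
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, 100).cumsum())

lng_df = pd.DataFrame({"Adj_Close": close})
# Percent change over the past 5 days (feature).
lng_df["5d_close_pct"] = close / close.shift(5) - 1
# Percent change over the next 5 days (target).
lng_df["5d_close_future_pct"] = close.shift(-5) / close - 1
# A 14-day moving average scaled by the current price (feature).
lng_df["ma14"] = close.rolling(14).mean() / close
feature_names = ["5d_close_pct", "ma14"]

# --- The steps from the instructions ---
# 1. Drop all rows with missing values.
lng_df = lng_df.dropna()

# 2. Separate features and targets.
features = lng_df[feature_names]
targets = lng_df["5d_close_future_pct"]

# 3. Combine targets and features to inspect correlations.
feature_and_target_cols = ["5d_close_future_pct"] + feature_names
feat_targ_df = lng_df[feature_and_target_cols]
corr = feat_targ_df.corr()
print(corr)

# NumPy arrays, if an algorithm needs them instead of pandas objects.
X = features.to_numpy()
y = targets.to_numpy()
```

Putting the target column first means the first row of the correlation matrix shows how strongly each feature correlates with the target.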

