Create features and targets

We almost have features and targets that are machine-learning ready -- we have features from current price changes (5d_close_pct) and indicators (moving averages and RSI), and we created targets of future price changes (5d_close_future_pct). Now we need to break these up into separate numpy arrays so we can feed them into machine learning algorithms.
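As a minimal sketch of that split, assuming a small made-up DataFrame (the values, the `ma14` column, and the names `toy_df` and `toy_feature_names` are illustrative and not part of the course data), pandas objects can be converted to numpy arrays with `.to_numpy()`:

```python
import pandas as pd

# Hypothetical toy data; all values are made up for illustration
toy_df = pd.DataFrame({
    '5d_close_pct': [0.01, -0.02, 0.03, 0.00],
    'ma14': [1.01, 0.99, 1.02, 1.00],
    '5d_close_future_pct': [-0.02, 0.03, 0.00, 0.01],
})
toy_feature_names = ['5d_close_pct', 'ma14']

# Features become a 2-D array (rows x features); targets a 1-D array
X = toy_df[toy_feature_names].to_numpy()
y = toy_df['5d_close_future_pct'].to_numpy()

print(X.shape, y.shape)  # (4, 2) (4,)
```

Many libraries also accept the DataFrame and Series directly, so the explicit conversion is optional in those cases.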

Our indicators also leave missing values at the beginning of the DataFrame, because rolling calculations need a full window of past data before they produce a value. We could backfill this data, fill it with a single value, or drop the rows. Dropping the rows is a good choice, so our machine learning algorithms aren't confused by backfilled or 0-filled data. We will use the pandas .dropna() method to drop any rows with missing values.
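To illustrate where those missing values come from, here is a small sketch on a hypothetical price series (the `prices` frame, its values, and the 3-period window are assumptions chosen for brevity, not the course's settings):

```python
import pandas as pd

# Hypothetical price series; values are made up for illustration
prices = pd.DataFrame({'close': [10.0, 10.5, 10.2, 10.8, 11.0, 10.9]})

# A 3-period moving average is undefined for the first 2 rows
prices['ma3'] = prices['close'].rolling(window=3).mean()

n_missing = int(prices['ma3'].isna().sum())

# .dropna() returns a new DataFrame; the original is left unchanged
cleaned = prices.dropna()

print(n_missing, len(prices), len(cleaned))  # 2 6 4
```

Because `.dropna()` does not modify the DataFrame in place by default, the result has to be assigned back to a variable to take effect.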

This exercise is part of the course

Machine Learning for Finance in Python

Exercise instructions

Drop the missing values from lng_df with .dropna() from pandas.
Create a variable containing our targets, which are the '5d_close_future_pct' values.
Create a DataFrame containing both targets (5d_close_future_pct) and features (contained in the existing list feature_names) so we can check the correlations.

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Drop all na values
lng_df = lng_df.____

# Create features and targets
# use feature_names for features; '5d_close_future_pct' for targets
features = lng_df[feature_names]
targets = lng_df[____]

# Create DataFrame from target column and feature columns
feature_and_target_cols = ['5d_close_future_pct'] + ____
feat_targ_df = lng_df[feature_and_target_cols]

# Calculate correlation matrix
corr = feat_targ_df.corr()
print(corr)
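
One way the blanks above might be filled in, sketched on a synthetic stand-in for lng_df. The random-walk prices, the `ma14` feature, and the two-item feature_names list are assumptions made so the example runs on its own; the course supplies its own data and feature list.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the course data (assumption: random-walk prices)
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, 60).cumsum())

lng_df = pd.DataFrame({'close': close})
lng_df['5d_close_pct'] = lng_df['close'].pct_change(5)
lng_df['5d_close_future_pct'] = lng_df['5d_close_pct'].shift(-5)
lng_df['ma14'] = lng_df['close'].rolling(14).mean() / lng_df['close']
feature_names = ['5d_close_pct', 'ma14']  # hypothetical feature list

# Drop all na values
lng_df = lng_df.dropna()

# Create features and targets
features = lng_df[feature_names]
targets = lng_df['5d_close_future_pct']

# Create DataFrame from target column and feature columns
feature_and_target_cols = ['5d_close_future_pct'] + feature_names
feat_targ_df = lng_df[feature_and_target_cols]

# Calculate correlation matrix
corr = feat_targ_df.corr()
print(corr)
```

In this sketch, rows are lost at both ends: the first 13 lack a full 14-period moving average, and the last 5 lack a future price change.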
