Evaluate performance

Finally, as always, we want to evaluate the performance of our best model to see how well or poorly it is doing. Ideally we would back-test the model, but that is an involved process we don't have room to cover in this course.

We've already seen the \(R^2\) scores, but let's also look at a scatter plot of predictions vs. actual values using matplotlib. Perfect predictions would fall on a diagonal line from the lower left to the upper right.
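To make the link between the score and the diagonal concrete, here is a minimal sketch that computes \(R^2\) by hand as one minus the ratio of the residual sum of squares to the total sum of squares. The helper name r_squared and the toy numbers are assumptions made for illustration; they are not part of the course data.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: how far predictions are from the actual values
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: how far actual values are from their own mean
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Toy values (hypothetical, not from the course dataset)
actual = [1.0, 2.0, 3.0, 4.0]

# Predictions equal to the actual values sit exactly on the diagonal
r2_perfect = r_squared(actual, list(actual))

# Always predicting the mean of the actual values explains no variance
mean_value = sum(actual) / len(actual)
r2_mean = r_squared(actual, [mean_value] * len(actual))

print(r2_perfect, r2_mean)
```

Points on the diagonal give a score of 1, always predicting the mean gives 0, and predictions worse than the mean give a negative score.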

This exercise is part of the course

Machine Learning for Finance in Python

Exercise instructions

Use the best value of max_features for our RandomForestRegressor (rfr), which we found in the previous exercise (it was 4).
Make predictions with the model on both train_features and test_features.
Scatter the actual targets (train_targets, test_targets) against the predictions (train_predictions, test_predictions), and label the two datasets train and test.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Use the best hyperparameters from before to fit a random forest model
rfr = RandomForestRegressor(n_estimators=200, max_depth=3, max_features=____, random_state=42)
rfr.fit(train_features, train_targets)

# Make predictions with our model
train_predictions = rfr.predict(____)
test_predictions = ____

# Create a scatter plot with train and test actual vs predictions
plt.scatter(train_targets, train_predictions, label='train')
plt.scatter(____)
plt.legend()
plt.show()
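For reference, here is one way the completed steps might look end to end. This is a sketch under stated assumptions: the features and targets are synthetic stand-ins generated with NumPy (the course's own data is not shown here), the split sizes are arbitrary, and the dashed reference diagonal is an addition for readability. Only max_features=4 and the other hyperparameters come from the exercise itself.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-ins for the course data (assumption, for illustration only)
rng = np.random.default_rng(42)
features = rng.normal(size=(300, 6))
targets = 0.5 * features[:, 0] - 0.3 * features[:, 1] + rng.normal(scale=0.5, size=300)

# Keep the rows in order: the first 255 rows train the model, the last 45 test it
train_size = 255
train_features, test_features = features[:train_size], features[train_size:]
train_targets, test_targets = targets[:train_size], targets[train_size:]

# Fit the random forest with the max_features value found earlier (4)
rfr = RandomForestRegressor(n_estimators=200, max_depth=3, max_features=4, random_state=42)
rfr.fit(train_features, train_targets)

# Make predictions on both datasets
train_predictions = rfr.predict(train_features)
test_predictions = rfr.predict(test_features)

# Scatter actual values (x-axis) against predictions (y-axis)
plt.scatter(train_targets, train_predictions, label='train')
plt.scatter(test_targets, test_predictions, label='test')

# Dashed diagonal marking where perfect predictions would fall
low, high = targets.min(), targets.max()
plt.plot([low, high], [low, high], linestyle='--', color='gray', label='perfect')

plt.xlabel('actual')
plt.ylabel('predicted')
plt.legend()
plt.show()
```

In a notebook or interactive session, the matplotlib.use("Agg") line can be dropped so the figure is displayed. The further the points spread from the dashed line, the weaker the predictions; a test cloud that is much more scattered than the train cloud suggests overfitting.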
