Evaluate performance

Lastly, as always, we want to evaluate the performance of our best model to see how well or poorly it does. Ideally we would back-test it, but that is an involved process we don't have room to cover in this course.

We've already seen the \(R^2\) scores, but let's take a look at a scatter plot of predictions vs. actual values using matplotlib. Perfect predictions would fall on a diagonal line from the lower left to the upper right.
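To make the link between the \(R^2\) score and the diagonal concrete, here is a minimal sketch. The helper name `r_squared` and the sample values are made up for illustration and are not part of the course code. It uses the usual definition \(R^2 = 1 - SS_{res}/SS_{tot}\): predictions that sit exactly on the diagonal score 1, while always predicting the mean of the targets scores 0.

```python
import numpy as np


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (hypothetical helper)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((actual - predicted) ** 2)      # squared prediction errors
    ss_tot = np.sum((actual - actual.mean()) ** 2)  # spread of targets around their mean
    return 1.0 - ss_res / ss_tot


# Made-up target values, purely for illustration
actual = np.array([0.01, -0.02, 0.03, 0.00, -0.01])

perfect = actual.copy()                         # every point lies on the diagonal
baseline = np.full_like(actual, actual.mean())  # always predict the mean target

print(r_squared(actual, perfect))   # 1.0
print(r_squared(actual, baseline))  # 0.0
```

The further the scatter strays from the diagonal, the larger the squared errors and the lower the score. Predictions worse than the mean baseline give a negative score.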

This exercise is part of the course

Machine Learning for Finance in Python

Exercise instructions

  • Use the best number for max_features in our RandomForestRegressor (rfr) that we found in the previous exercise (it was 4).
  • Make predictions using the model with the train_features and test_features.
  • Scatter actual targets (train/test_targets) vs the predictions (train/test_predictions), and label the datasets train and test.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Use the best hyperparameters from before to fit a random forest model
rfr = RandomForestRegressor(n_estimators=200, max_depth=3, max_features=____, random_state=42)
rfr.fit(train_features, train_targets)

# Make predictions with our model
train_predictions = rfr.predict(____)
test_predictions = ____

# Create a scatter plot with train and test actual vs predictions
plt.scatter(train_targets, train_predictions, label='train')
plt.scatter(____)
plt.legend()
plt.show()
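
For readers who want to run the whole flow outside the course environment, the following is a self-contained sketch. It assumes synthetic stand-in data, since the course's `train_features`, `train_targets`, `test_features`, and `test_targets` are not available here. The 85/15 unshuffled split and the dashed reference line are likewise assumptions added for illustration. Only the `RandomForestRegressor` settings come from the exercise.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (an assumption; the course supplies its own features/targets)
rng = np.random.default_rng(42)
features = rng.normal(size=(300, 8))
targets = 0.5 * features[:, 0] - 0.3 * features[:, 1] + rng.normal(scale=0.5, size=300)

# Unshuffled split: first 85% for training, the rest for testing (assumed ratio)
split = int(0.85 * len(targets))
train_features, test_features = features[:split], features[split:]
train_targets, test_targets = targets[:split], targets[split:]

# Same hyperparameters as the exercise, with max_features=4
rfr = RandomForestRegressor(n_estimators=200, max_depth=3, max_features=4, random_state=42)
rfr.fit(train_features, train_targets)

# Predictions for both datasets
train_predictions = rfr.predict(train_features)
test_predictions = rfr.predict(test_features)

# R^2 on each dataset, for comparison with the plot
print(rfr.score(train_features, train_targets))
print(rfr.score(test_features, test_targets))

# Actual vs. predicted, plus a dashed diagonal marking perfect predictions
fig, ax = plt.subplots()
ax.scatter(train_targets, train_predictions, label='train')
ax.scatter(test_targets, test_predictions, label='test')
lims = [targets.min(), targets.max()]
ax.plot(lims, lims, 'k--', label='perfect prediction')
ax.set_xlabel('actual')
ax.set_ylabel('predicted')
ax.legend()
plt.show()
```

Drawing the diagonal explicitly makes it easier to judge how far each cloud of points sits from perfect predictions. A train cloud that hugs the line much more tightly than the test cloud is a sign of overfitting.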