Optimize n_neighbors
Now that we have scaled data, we can try using a KNN model. To maximize performance, we should tune our model's hyperparameters. For the k-nearest neighbors algorithm, we only have one hyperparameter: n, the number of neighbors. We set this hyperparameter when we create the model with KNeighborsRegressor; the argument for the number of neighbors is n_neighbors.
We want to try a range of values that passes through the setting with the best performance. Usually we start with 2 neighbors and increase until our scoring metric starts to decrease. We'll use the R\(^2\) value from the .score() method on the test set (scaled_test_features and test_targets) to optimize n here, and use the test set scores to determine the best n.
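The tuning loop described above can be sketched end to end on synthetic data. This is a self-contained illustration, not the course's dataset: make_regression stands in for the financial features, and the variable names mirror the exercise's (scaled_train_features, scaled_test_features).

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the course's financial features and targets
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
X_train, X_test, train_targets, test_targets = train_test_split(X, y, random_state=0)

# Scale features, fitting the scaler on the training set only
scaler = StandardScaler()
scaled_train_features = scaler.fit_transform(X_train)
scaled_test_features = scaler.transform(X_test)

# Try n_neighbors from 2 through 12 and record the test-set R^2 for each
scores = {}
for n in range(2, 13):
    knn = KNeighborsRegressor(n_neighbors=n)
    knn.fit(scaled_train_features, train_targets)
    scores[n] = knn.score(scaled_test_features, test_targets)

# The best n is the one with the highest test-set R^2
best_n = max(scores, key=scores.get)
print("best n_neighbors:", best_n)
```

In this pattern, the train-set score typically falls as n grows (less overfitting), while the test-set score rises and then falls, which is why we look for the peak.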
This exercise is part of the course Machine Learning for Finance in Python.
Instructions
- Loop through values of 2 to 12 for n and set this as n_neighbors in the knn model.
- Fit the model to the training data (scaled_train_features and train_targets).
- Print out the R\(^2\) values using the .score() method of the knn model for the train and test sets, and take note of the best score on the test set.
Hands-on interactive exercise
Try this exercise by completing this sample code.
from sklearn.neighbors import KNeighborsRegressor

for n in range(____):
    # Create and fit the KNN model
    knn = KNeighborsRegressor(n_neighbors=____)

    # Fit the model to the training data
    knn.fit(____, ____)

    # Print number of neighbors and the score to find the best value of n
    print("n_neighbors =", n)
    print('train, test scores')
    print(knn.score(scaled_train_features, train_targets))
    print(knn.score(____, ____))
    print()  # prints a blank line
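The exercise selects n by looking at test-set scores directly. In practice, hyperparameters are usually tuned by cross-validation on the training set so the test set stays untouched until a final evaluation. The sketch below shows that alternative with scikit-learn's GridSearchCV, again on synthetic stand-in data rather than the course's dataset.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data (the course uses scaled financial features)
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scale the training features before fitting KNN
scaler = StandardScaler()
scaled_train = scaler.fit_transform(X_train)

# 5-fold cross-validated search over the same n_neighbors range (2-12)
grid = GridSearchCV(KNeighborsRegressor(),
                    param_grid={"n_neighbors": list(range(2, 13))},
                    cv=5)
grid.fit(scaled_train, y_train)
print("best n_neighbors:", grid.best_params_["n_neighbors"])
```

This keeps model selection inside the training data; the held-out test set then gives an unbiased estimate of the chosen model's performance.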