Optimize n_neighbors
Now that we have scaled data, we can try using a KNN model. To maximize performance, we should tune our model's hyperparameters. For the k-nearest neighbors algorithm, we only have one hyperparameter: n, the number of neighbors. We set this hyperparameter when we create the model with KNeighborsRegressor; the argument for the number of neighbors is n_neighbors.
We want to try a range of values that passes through the setting with the best performance. Usually we start with 2 neighbors and increase until our scoring metric starts to decrease. We'll use the R\(^2\) value from the .score() method on the test set (scaled_test_features and test_targets) to optimize n here, and use the test set scores to determine the best n.
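The tuning loop described above can be sketched end to end on synthetic data. This is a self-contained illustration, not the course's dataset: make_regression stands in for the financial features, and the variable names mirror the exercise's (scaled_train_features, scaled_test_features).

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the course's financial features and targets
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
X_train, X_test, train_targets, test_targets = train_test_split(X, y, random_state=0)

# Scale features, fitting the scaler on the training set only
scaler = StandardScaler()
scaled_train_features = scaler.fit_transform(X_train)
scaled_test_features = scaler.transform(X_test)

# Try n_neighbors from 2 through 12 and record the test-set R^2 for each
scores = {}
for n in range(2, 13):
    knn = KNeighborsRegressor(n_neighbors=n)
    knn.fit(scaled_train_features, train_targets)
    scores[n] = knn.score(scaled_test_features, test_targets)

# The best n is the one with the highest test-set R^2
best_n = max(scores, key=scores.get)
print("best n_neighbors:", best_n)
```

In this pattern, the train-set score typically falls as n grows (less overfitting), while the test-set score rises and then falls, which is why we look for the peak.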
This exercise is part of the course Machine Learning for Finance in Python.
Instructions
- Loop through values of 2 to 12 for n and set this as n_neighbors in the knn model.
- Fit the model to the training data (scaled_train_features and train_targets).
- Print out the R\(^2\) values using the .score() method of the knn model for the train and test sets, and take note of the best score on the test set.
Hands-on interactive exercise
Try this exercise by completing this sample code.
from sklearn.neighbors import KNeighborsRegressor

for n in range(____):
    # Create and fit the KNN model
    knn = KNeighborsRegressor(n_neighbors=____)

    # Fit the model to the training data
    knn.fit(____, ____)

    # Print number of neighbors and the score to find the best value of n
    print("n_neighbors =", n)
    print('train, test scores')
    print(knn.score(scaled_train_features, train_targets))
    print(knn.score(____, ____))
    print()  # prints a blank line
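The exercise selects n by looking at test-set scores directly. In practice, hyperparameters are usually tuned by cross-validation on the training set so the test set stays untouched until a final evaluation. The sketch below shows that alternative with scikit-learn's GridSearchCV, again on synthetic stand-in data rather than the course's dataset.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data (the course uses scaled financial features)
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scale the training features before fitting KNN
scaler = StandardScaler()
scaled_train = scaler.fit_transform(X_train)

# 5-fold cross-validated search over the same n_neighbors range (2-12)
grid = GridSearchCV(KNeighborsRegressor(),
                    param_grid={"n_neighbors": list(range(2, 13))},
                    cv=5)
grid.fit(scaled_train, y_train)
print("best n_neighbors:", grid.best_params_["n_neighbors"])
```

This keeps model selection inside the training data; the held-out test set then gives an unbiased estimate of the chosen model's performance.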