
Optimize n_neighbors

Now that we have scaled data, we can try a KNN model. To maximize performance, we should tune our model's hyperparameters. For the k-nearest neighbors algorithm, the main hyperparameter is n, the number of neighbors. We set this hyperparameter when we create the model with KNeighborsRegressor, using the n_neighbors argument.

We want to try a range of values that passes through the setting with the best performance. Usually we start with 2 neighbors and increase until our scoring metric starts to decrease. To optimize n here, we'll use the R\(^2\) value from the .score() method on the test set (scaled_test_features and test_targets), and choose the n with the best test-set score.

This exercise is part of the course

Machine Learning for Finance in Python


Exercise instructions

  • Loop through values of 2 to 12 for n and set this as n_neighbors in the knn model.
  • Fit the model to the training data (scaled_train_features and train_targets).
  • Print out the R\(^2\) values using the .score() method of the knn model for the train and test sets, and take note of the best score on the test set.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

from sklearn.neighbors import KNeighborsRegressor

for n in range(____):
    # Create the KNN model
    knn = KNeighborsRegressor(n_neighbors=____)
    
    # Fit the model to the training data
    knn.fit(____, ____)
    
    # Print number of neighbors and the score to find the best value of n
    print("n_neighbors =", n)
    print('train, test scores')
    print(knn.score(scaled_train_features, train_targets))
    print(knn.score(____, ____))
    print()  # prints a blank line
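For reference, a completed version of the loop might look like the sketch below. The course's prepared variables (scaled_train_features, train_targets, scaled_test_features, test_targets) are not available here, so this sketch builds synthetic stand-ins with make_regression, train_test_split, and StandardScaler; the loop itself matches the structure of the exercise.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-ins for the course's prepared data (assumption:
# the real exercise uses pre-loaded financial features and targets)
features, targets = make_regression(n_samples=300, n_features=5,
                                    noise=10, random_state=42)
train_features, test_features, train_targets, test_targets = train_test_split(
    features, targets, random_state=42)

# Scale using the training set's statistics only
scaler = StandardScaler()
scaled_train_features = scaler.fit_transform(train_features)
scaled_test_features = scaler.transform(test_features)

best_n, best_score = None, float('-inf')
for n in range(2, 13):  # try n_neighbors from 2 through 12
    # Create and fit the KNN model for this value of n
    knn = KNeighborsRegressor(n_neighbors=n)
    knn.fit(scaled_train_features, train_targets)

    # R^2 on train and test sets; track the best test score
    train_score = knn.score(scaled_train_features, train_targets)
    test_score = knn.score(scaled_test_features, test_targets)
    print(f"n_neighbors = {n}: train R^2 = {train_score:.3f}, "
          f"test R^2 = {test_score:.3f}")
    if test_score > best_score:
        best_n, best_score = n, test_score

print("best n_neighbors:", best_n)
```

Note that we fit the scaler on the training data only and reuse it to transform the test data, so no test-set information leaks into the features.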