
Exercise

Overfitting and underfitting

Remember the model complexity curve that Hugo showed in the video? You will now construct such a curve for the digits dataset! In this exercise, you will compute and plot the training and testing accuracy scores for a range of neighbor values. By observing how the accuracy scores differ between the training and testing sets for different values of k, you will develop your intuition for overfitting and underfitting.

The training and testing sets are available to you in the workspace as X_train, X_test, y_train, y_test. In addition, KNeighborsClassifier has been imported from sklearn.neighbors.

Instructions

  • Inside the for loop:
    • Set up a k-NN classifier with the number of neighbors equal to k.
    • Fit the classifier with k neighbors to the training data.
    • Compute accuracy scores on the training set and the test set separately using the .score() method, and assign the results to the train_accuracy and test_accuracy arrays respectively (see the sketch after these instructions).
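The code below is a minimal sketch of one way to complete the loop, not the exercise's exact solution. It assumes a NumPy array called neighbors holding the candidate values of k (here 1 through 8) and pre-allocated train_accuracy and test_accuracy arrays; the data loading and plotting steps are included only so the sketch runs standalone, since the exercise workspace already provides X_train, X_test, y_train, and y_test.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# The exercise workspace provides these splits; they are recreated here only
# so the sketch is self-contained.
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=42, stratify=digits.target)

neighbors = np.arange(1, 9)                  # candidate values of k (assumed range)
train_accuracy = np.empty(len(neighbors))    # training accuracy for each k
test_accuracy = np.empty(len(neighbors))     # testing accuracy for each k

for i, k in enumerate(neighbors):
    # Set up a k-NN classifier with k neighbors
    knn = KNeighborsClassifier(n_neighbors=k)

    # Fit the classifier to the training data
    knn.fit(X_train, y_train)

    # Compute accuracy on the training set and the test set
    train_accuracy[i] = knn.score(X_train, y_train)
    test_accuracy[i] = knn.score(X_test, y_test)

# Plot the model complexity curve
plt.title('k-NN: Varying Number of Neighbors')
plt.plot(neighbors, train_accuracy, label='Training Accuracy')
plt.plot(neighbors, test_accuracy, label='Testing Accuracy')
plt.xlabel('Number of Neighbors')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

Plotting both curves against the number of neighbors makes the two regimes visible: at small k the training accuracy is high while the test accuracy lags behind (overfitting), and at large k both accuracies drop off (underfitting).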