1. Learn
  2. /
  3. Courses
  4. /
  5. Preprocessing for Machine Learning in Python

Exercise

Modeling the UFO dataset, part 1

In this exercise, you're going to build a k-nearest neighbor model to predict which country the UFO sighting took place in. The X dataset contains the log-normalized seconds column, the one-hot encoded type columns, as well as the month and year when the sighting took place. The y labels are the encoded country column, where 1 is "us" and 0 is "ca".

Instructions

100 XP
  • Print out the .columns of the X set.
  • Split the X and y sets, ensuring that the class distribution of the labels is the same in the training and tests sets, and using a random_state of 42.
  • Fit knn to the training data.
  • Print the test set accuracy of the knn model.