
Exercise

Modeling the UFO dataset, part 2

Finally, you'll build a model on the text vector created earlier, desc_tfidf, using the filtered_words list to select columns and produce a filtered text vector. Let's see if you can predict the type of a sighting from its text description. You'll use a Naive Bayes model for this.

Instructions

  • Filter the desc_tfidf vector by passing filtered_words, converted to a list, as the column index.
  • Split the filtered_text features and y, ensuring an equal class distribution in the training and test sets; use a random_state of 42.
  • Use the nb model's .fit() to fit X_train and y_train.
  • Print out the .score() of the nb model on the X_test and y_test sets.
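
The four steps above can be sketched as follows. This is only an illustration under stated assumptions, not the exercise's own environment: the sample descriptions and labels are invented stand-ins for the UFO data, filtered_words is assumed to hold column indices into desc_tfidf, and nb is assumed to be a scikit-learn GaussianNB (the text only says "a Naive Bayes model"), which is why the sparse matrix is converted with .toarray() before splitting.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Hypothetical stand-in data; the real exercise uses the UFO dataset.
descriptions = [
    "bright light hovering in the sky",
    "orange light moving fast",
    "a light blinking over the lake",
    "white light seen at night",
    "dark triangle flying low",
    "triangle shaped craft silent",
    "large triangle over the highway",
    "black triangle with three corners",
    "silver disk spinning",
    "metallic disk above the field",
    "a disk hovering near trees",
    "flat disk seen at dusk",
]
y = ["light"] * 4 + ["triangle"] * 4 + ["disk"] * 4

# Build the tf-idf text vector (already available as desc_tfidf in the exercise).
vec = TfidfVectorizer()
desc_tfidf = vec.fit_transform(descriptions)

# Assumption: filtered_words is a set of column indices for the words to keep.
filtered_words = {vec.vocabulary_[w] for w in ("light", "triangle", "disk")}

# Step 1: keep only the selected columns of the tf-idf matrix.
filtered_text = desc_tfidf[:, sorted(filtered_words)]

# Step 2: stratified split so class proportions match in train and test.
X_train, X_test, y_train, y_test = train_test_split(
    filtered_text.toarray(), y, stratify=y, random_state=42
)

# Steps 3 and 4: fit the Naive Bayes model and report test accuracy.
nb = GaussianNB()
nb.fit(X_train, y_train)
score = nb.score(X_test, y_test)
print(score)
```

Passing stratify=y is what keeps the class distribution equal across the two sets; without it, a small or imbalanced dataset can leave a class missing from the test split.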