Exercise

Exploring text vectors, part 2

Using the return_weights() function you wrote in the previous exercise, you're now going to extract the top words from each document in the text vector, return a list of the word indices, and use that list to filter the text vector down to those top words.

Instructions

  • Call return_weights() to return the top weighted words for the current document.
  • Call set() on the returned filter_list to remove duplicate word indices.
  • Call words_to_filter(), passing in the following arguments: vocab for the vocab parameter, tfidf_vec.vocabulary_ for the original_vocab parameter, text_tfidf for the vector parameter, and 3 for the top_n parameter, to grab the top 3 weighted words from each document.
  • Finally, convert the filtered_words set to a list and use it to filter the text vector.
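
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the exercise's reference solution: the sample documents are made up, the body of return_weights() is one plausible version of the function from the previous exercise (its vector_index parameter is an assumption), and vocab is assumed to map column indices back to words.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical sample documents; the exercise uses its own dataset.
docs = [
    "volunteers needed to tutor students in reading",
    "help plant trees in the community garden",
    "tutor students in math after school",
]

tfidf_vec = TfidfVectorizer()
text_tfidf = tfidf_vec.fit_transform(docs)  # sparse matrix: documents x vocabulary

# Assumed shape of vocab: column index -> word (the reverse of vocabulary_).
vocab = {idx: word for word, idx in tfidf_vec.vocabulary_.items()}


def return_weights(vocab, original_vocab, vector, vector_index, top_n):
    """Return the column indices of the top_n weighted words in one document."""
    row = vector[vector_index]
    # Map each word present in this document to its tf-idf weight.
    weights = {vocab[i]: w for i, w in zip(row.indices, row.data)}
    top_words = sorted(weights, key=weights.get, reverse=True)[:top_n]
    return [original_vocab[word] for word in top_words]


def words_to_filter(vocab, original_vocab, vector, top_n):
    """Collect the top_n word indices from every document, without duplicates."""
    filter_list = []
    for i in range(vector.shape[0]):
        # Top weighted words for the current document.
        filtered = return_weights(vocab, original_vocab, vector, i, top_n)
        filter_list.extend(filtered)
    # set() drops indices that appear in more than one document.
    return set(filter_list)


filtered_words = words_to_filter(vocab, tfidf_vec.vocabulary_, text_tfidf, 3)

# Keep only the columns that correspond to the top words.
filtered_text = text_tfidf[:, list(filtered_words)]
print(filtered_text.shape)
```

The filtered matrix keeps one row per document but only as many columns as there are distinct top words, so it is never wider than top_n times the number of documents.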