Exercise

Frequency analysis of product reviews

You now have access to a larger dataset of TechZone product reviews. As before, you've preprocessed the reviews and transformed them into a bag-of-words (BoW) representation, X. Your task now is to analyze the word frequencies and identify the most common terms in the dataset.

To help with the analysis, a helper function called get_top_ten() is provided. It takes in a list of words and their corresponding counts, and returns the 10 most frequent words and their counts.

Instructions 1/2

  • Derive word_counts, the total count for each word across all reviews.
  • Retrieve the list of unique words learned by the vectorizer.