Exercise

Word frequency analysis

Congratulations! You've just joined PyBooks, which is developing a book recommendation system and wants to find patterns and trends in text to improve its recommendations.

To begin, you'll want to understand the frequency of words in a given text and remove any rare words.

Note that typical real-world datasets will be larger than this example.

Instructions

  • Import get_tokenizer from torchtext.data.utils and FreqDist from nltk.probability.
  • Initialize the tokenizer for English and tokenize the given text.
  • Calculate the frequency distribution of the tokens and remove rare words using a list comprehension.
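
The steps above can be sketched roughly as follows. To keep the sketch runnable without extra installs, it swaps in standard-library stand-ins: a small regex tokenizer in place of torchtext's get_tokenizer("basic_english"), and collections.Counter in place of nltk's FreqDist (which is itself a Counter subclass). The sample text, the tokenize helper, and the threshold of 2 are assumptions for illustration; the exercise supplies its own text and cutoff.

```python
import re
from collections import Counter

# Hypothetical sample text; the exercise provides its own `text`.
text = "The cat sat on the mat. The dog sat on the log."

def tokenize(s):
    # Stand-in for get_tokenizer("basic_english") from torchtext.data.utils:
    # lowercase the input and split words from punctuation.
    return re.findall(r"[a-z0-9']+|[^\sa-z0-9']", s.lower())

tokens = tokenize(text)

# Stand-in for FreqDist(tokens) from nltk.probability;
# both map each token to the number of times it occurs.
freq_dist = Counter(tokens)

# Assumed cutoff: tokens seen fewer than `threshold` times count as rare.
threshold = 2

# Keep only the tokens that occur at least `threshold` times.
common_tokens = [token for token in tokens if freq_dist[token] >= threshold]

print(common_tokens)
```

With the real libraries installed, the two stand-ins would be replaced by get_tokenizer("basic_english") and FreqDist(tokens); the list comprehension stays the same.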