Course: Deep Learning for Text with PyTorch

Exercise

Shakespearean language preprocessing pipeline

Over at PyBooks, the team wants to transform a vast library of Shakespearean text data for further analysis. The most efficient way to do this is with a text processing pipeline, starting with the preprocessing steps.

The following have been loaded for you: torch, nltk, stopwords, PorterStemmer, get_tokenizer.

The Shakespearean text data is saved as shakespeare and the sentences have already been extracted.

Instructions (1 of 3)

  • Create a list of unique English stopwords, saving them to stop_words.
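
One possible way to build stop_words is sketched below. It assumes the NLTK stopwords corpus is available. In the exercise environment stopwords is already loaded, so only the set(...) line is needed there. The helper load_english_stopwords, its small fallback list, and the sample sentence are hypothetical additions that keep the sketch runnable outside the course environment. Wrapping the list in a set removes duplicates and makes membership checks fast.

```python
def load_english_stopwords():
    """Return NLTK's English stopwords, or a tiny sample if NLTK data is missing."""
    try:
        from nltk.corpus import stopwords
        return stopwords.words("english")
    except (ImportError, LookupError):
        # Hypothetical fallback for illustration only; duplicates are deliberate
        # to show that set() removes them. The real NLTK list is much longer.
        return ["the", "and", "is", "the", "of", "and"]


# A set keeps each stopword once and gives constant-time lookups on average.
stop_words = set(load_english_stopwords())

# Illustrative use: drop stopwords from a whitespace-tokenized sentence.
sentence = "the quality of mercy is not strained"
filtered = [token for token in sentence.split() if token not in stop_words]
print(filtered)
```

If the corpus is missing locally, nltk.download("stopwords") fetches it. The exact contents of filtered depend on which stopword list was loaded.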