Preprocessing text
Before a recommendation system, or any other model, can work with text, that text must be preprocessed.
A block of text from Sherlock Holmes is loaded here. Preprocess this text using the various techniques presented in the video to prepare it for further analysis.
The text variable is an excerpt from The Hound of the Baskervilles by Arthur Conan Doyle.
The following packages and functions have been loaded for you: nltk, torch, get_tokenizer, PorterStemmer, stopwords.
This exercise is part of the course
Deep Learning for Text with PyTorch
Interactive exercise
Try this exercise by completing this sample code.
# Initialize and tokenize the text
tokenizer = ____("basic_english")
tokens = ____(____)
print(tokens)
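
To make the preprocessing stages concrete, here is a minimal standard-library sketch of tokenization, stop word removal, and stemming. It is not the exercise solution and does not use get_tokenizer, stopwords, or PorterStemmer. The sample sentence, the tiny STOP_WORDS set, and the helper names tokenize, remove_stop_words, and naive_stem are all hypothetical stand-ins chosen for illustration, and the suffix stripper is deliberately much cruder than the Porter algorithm.

```python
import re

# Hypothetical sample sentence; the exercise's own `text` variable is not reproduced here.
sample = "The dogs were running across the moor, and the hounds barked loudly."

# Tiny illustrative stop word set (an assumption; NLTK's English list is far larger).
STOP_WORDS = {"the", "and", "were", "a", "an", "of", "to", "in", "across"}


def tokenize(text):
    """Lowercase the text and split it into word and punctuation tokens."""
    return re.findall(r"[a-z0-9']+|[.,!?;]", text.lower())


def remove_stop_words(tokens):
    """Drop every token that appears in the stop word set."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Strip one common suffix. This is NOT the Porter algorithm."""
    for suffix in ("ing", "ly", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


tokens = tokenize(sample)
filtered = remove_stop_words(tokens)
stems = [naive_stem(t) for t in filtered]
print(stems)
```

The crude stemmer turns "running" into "runn", which shows why a rule-based stemmer such as Porter's applies additional clean-up steps after removing a suffix.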