Shakespearean language encoder

With the preprocessed Shakespearean text at your fingertips, you now need to encode it into a numerical representation. You will define the encoding steps first, then put the pipeline together. To handle large amounts of data efficiently, you will use PyTorch's Dataset and DataLoader to batch and shuffle the encoded data.

The following has been loaded for you: torch, nltk, stopwords, PorterStemmer, get_tokenizer, CountVectorizer, Dataset, DataLoader, and preprocess_sentences.

The preprocessed Shakespearean text is also available to you as processed_shakespeare.
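To make "numerical representation" concrete, here is a minimal sketch of a bag-of-words encoding with CountVectorizer. It assumes the preprocessed text is a list of strings; sample_sentences is a hypothetical stand-in for processed_shakespeare, and the exercise's actual encoding steps may differ.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical stand-in for processed_shakespeare: a few already-preprocessed sentences
sample_sentences = [
    "love look eye mind",
    "brevity soul wit",
    "love blind lover see",
]

# Learn the vocabulary and count how often each word occurs in each sentence
vectorizer = CountVectorizer()
encoded = vectorizer.fit_transform(sample_sentences).toarray()

# One row per sentence, one column per vocabulary word
print(encoded.shape)

# Column index assigned to the word "love"
love_column = vectorizer.vocabulary_["love"]
print(encoded[:, love_column])
```

Each row is a vector of word counts, so the text can now be converted to a tensor and passed to a model.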

This exercise is part of the course

Deep Learning for Text with PyTorch

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Define your Dataset class
class ____(Dataset):
    def __init__(self, data):
        self.data = ____
    def __len__(self):
        return len(self.data)
    def __getitem__(self, idx):
        return self.____[____]
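For reference, here is one possible way such a Dataset class could look once completed, together with a DataLoader that batches and shuffles it. This is a sketch under assumptions, not the exercise's official solution: the class name EncodedTextDataset and the tensor toy_encoded are hypothetical, chosen only for illustration.

```python
import torch
from torch.utils.data import Dataset, DataLoader

# Hypothetical class name; the exercise leaves the name blank
class EncodedTextDataset(Dataset):
    def __init__(self, data):
        # Keep a reference to the already-encoded data
        self.data = data

    def __len__(self):
        # Number of samples in the dataset
        return len(self.data)

    def __getitem__(self, idx):
        # Return the sample stored at position idx
        return self.data[idx]

# Hypothetical toy data: 4 encoded sentences over a 3-word vocabulary
toy_encoded = torch.tensor([[1, 0, 2], [0, 1, 0], [3, 1, 1], [0, 0, 1]])

dataset = EncodedTextDataset(toy_encoded)

# Group samples into batches of 2, reshuffling the order on every pass
dataloader = DataLoader(dataset, batch_size=2, shuffle=True)

batches = list(dataloader)
for batch in batches:
    print(batch.shape)
```

Because shuffle=True, the order of the samples changes on each pass over the DataLoader, while the batch shapes stay the same.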