Exercise

Make the vector a VCorpus object (2)

Now that we've converted our vector to a Source object, we pass it to another tm function, VCorpus(), to create our volatile corpus. Pretty straightforward, right?

The VCorpus object is a nested list, that is, a list of lists. At each index of the VCorpus object, there is a PlainTextDocument object, which is a list containing the actual text data (content) and some corresponding metadata (meta). It can help to visualize a VCorpus object to conceptualize the whole thing.
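
As a rough illustration of that nesting, here is a minimal sketch that builds a tiny corpus and inspects one document. The tweets vector and the toy_source and toy_corpus names are hypothetical stand-ins, not the course's coffee data.

```r
library(tm)

# Hypothetical stand-in for the coffee tweets (not the course data)
tweets <- c("I love coffee", "Espresso or latte?", "Coffee time!")

# Vector -> Source -> volatile corpus
toy_source <- VectorSource(tweets)
toy_corpus <- VCorpus(toy_source)

# The corpus holds one document per element of the vector
length(toy_corpus)

# Each document is a PlainTextDocument: a list with content and meta
doc <- toy_corpus[[2]]
class(doc)
names(doc)
```

Running str(doc) on the same object is another way to see both layers at once.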

To review a single document object (the 10th), you subset with double square brackets.

coffee_corpus[[10]]

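To review the actual text, you index the list twice: double square brackets select the document, then single brackets select one of its elements. Here [1] returns the content; to access the document's metadata, like the timestamp, change [1] to [2]. Another way to review the plain text is with the content() function, which doesn't need the second set of brackets.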

coffee_corpus[[10]][1]

content(coffee_corpus[[10]])
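
To compare these access patterns side by side, the following sketch uses a small hypothetical corpus (toy_corpus is an assumed name, not the course's coffee_corpus). It also shows meta(), the tm counterpart to content() for metadata.

```r
library(tm)

# Hypothetical two-document corpus standing in for coffee_corpus
toy_corpus <- VCorpus(VectorSource(c("I love coffee", "Espresso or latte?")))
doc <- toy_corpus[[1]]

# Single brackets return a one-element list
doc[1]   # list holding the content
doc[2]   # list holding the metadata

# content() returns the text itself as a character vector
content(doc)

# meta() returns the metadata; a tag name picks out one field
meta(doc)
meta(doc, "datetimestamp")
```

Single brackets keep the result wrapped in a list, so content() (or doc[[1]]) is the more convenient choice when you want the bare character vector.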
Instructions
  • Call the VCorpus() function on the coffee_source object to create coffee_corpus.
  • Verify coffee_corpus is a VCorpus object by printing it to the console.
  • Print the 15th element of coffee_corpus to the console to verify that it's a PlainTextDocument that contains the content and metadata of the 15th tweet. Use double bracket subsetting.
  • Print the content of the 15th tweet in coffee_corpus. Use double brackets to select the proper tweet, followed by single brackets to extract the content of that tweet.
  • Print the content() of the 10th tweet within coffee_corpus.