
Make the vector a VCorpus object (1)

Recall that you loaded your text data as a vector called coffee_tweets in the last exercise. Your next step is to convert this vector of text to a corpus. As you learned in the video, a corpus is a collection of documents. It is also important to know that within the tm package, a corpus is a data type in its own right, one that R recognizes as a distinct kind of object.

There are two kinds of corpus data type: the permanent corpus, PCorpus, and the volatile corpus, VCorpus. In essence, the difference between the two is where the collection of documents is stored. A PCorpus keeps its documents outside of R (for example, in a database), whereas a VCorpus is held entirely in your computer's RAM and disappears when the R object is removed. In this course, we will use the volatile corpus, which is simpler to set up and works well for data that fits in memory.

To make a volatile corpus, R needs to interpret each element in our vector of text, coffee_tweets, as a document. And the tm package provides what are called Source functions to do just that! In this exercise, we'll use a Source function called VectorSource() because our text data is contained in a vector. The output of this function is called a Source object. Give it a shot!
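As a minimal sketch of the step described above, the snippet below builds a Source object from a character vector. The three strings are a hypothetical stand-in for coffee_tweets, since the real data comes from the previous exercise; the snippet also assumes the tm package is already installed.

```r
# Load tm (assumes the package is installed, e.g. via install.packages("tm"))
library(tm)

# Hypothetical stand-in for the coffee_tweets vector from the previous exercise
coffee_tweets <- c(
  "I love my morning coffee",
  "Iced coffee is the best on a hot day",
  "Too much coffee, cannot sleep"
)

# Wrap the vector in a Source object; each element is treated as one document
coffee_source <- VectorSource(coffee_tweets)

# Inspect the result
class(coffee_source)   # class vector includes "VectorSource" and "Source"
length(coffee_source)  # number of elements the source will supply as documents
```

The Source object does not transform the text. It only tells tm how to read the documents one element at a time, which is why it can later be handed to a corpus constructor such as VCorpus().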

This exercise is part of the course

Text Mining with Bag-of-Words in R


Exercise instructions

  • Load the tm package.
  • Create a Source object from the coffee_tweets vector. Call this new object coffee_source.

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Load tm
___

# Make a vector source from coffee_tweets
___