Text Mining with Bag-of-Words in R

Exercise

Find common words

Say you want to visualize common words across multiple documents. You can do this with commonality.cloud().

Each of the coffee and chardonnay corpora is composed of many individual tweets. To treat all the coffee tweets as a single document, and likewise for chardonnay, you paste() together the tweets in each corpus with the argument collapse = " ". This joins all the tweets, separated by spaces, into a single string (a character vector of length one). You can then create one vector containing the two collapsed documents.

a_single_string <- paste(a_character_vector, collapse = " ")
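To see what collapse does, here is a minimal sketch using a made-up character vector. The tweets object and its contents are hypothetical stand-ins, not the course data.

```r
# Hypothetical stand-in for a column of tweet text
tweets <- c("love my morning coffee", "coffee break", "iced coffee please")

# Without collapse, paste() returns one element per input element
length(paste(tweets))            # 3

# With collapse = " ", all elements are joined into one string
a_single_string <- paste(tweets, collapse = " ")
length(a_single_string)          # 1
a_single_string
```

The result is still a character vector, but it has length one, so downstream tools treat it as a single document.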

Once you're done with these steps, you can take the same approach you've seen before to create a VCorpus() based on a VectorSource from the all_tweets object.

Instructions

  • Create all_coffee by using paste() with collapse = " " on coffee_tweets$text.
  • Create all_chardonnay by using paste() with collapse = " " on chardonnay_tweets$text.
  • Create all_tweets using c() to combine all_coffee and all_chardonnay. Make all_coffee the first term.
  • Convert all_tweets to a source using VectorSource().
  • Create all_corpus by using VCorpus() on all_tweets.
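
The steps above can be sketched as follows. This is one possible walk-through, not the official solution: it assumes the tm package is installed, and the two small data frames are hypothetical stand-ins for the course's coffee_tweets and chardonnay_tweets objects. Reassigning the VectorSource() result back to all_tweets is also an assumption, inferred from the last instruction.

```r
library(tm)

# Hypothetical stand-ins for the course data (assumption: a text column)
coffee_tweets <- data.frame(
  text = c("coffee is life", "need more coffee"),
  stringsAsFactors = FALSE
)
chardonnay_tweets <- data.frame(
  text = c("chardonnay with dinner", "a glass of chardonnay"),
  stringsAsFactors = FALSE
)

# Collapse each set of tweets into one string
all_coffee <- paste(coffee_tweets$text, collapse = " ")
all_chardonnay <- paste(chardonnay_tweets$text, collapse = " ")

# Combine the two documents, coffee first
all_tweets <- c(all_coffee, all_chardonnay)

# Convert to a source (assumption: reassigned to the same name)
all_tweets <- VectorSource(all_tweets)

# Build the corpus; each collapsed string becomes one document
all_corpus <- VCorpus(all_tweets)
length(all_corpus)
```

The corpus should contain two documents, one per collapsed string, with the coffee document first.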