Session Ready
Exercise

Visualize dissimilar words

Say you want to visualize the words not in common. To do this, you can also use comparison.cloud(), and the steps are quite similar with one main difference.

Like when you were searching for words in common, you start by unifying the tweets into distinct corpora and combining them into their own VCorpus() object. Next apply a clean_corpus() function and organize it into a TermDocumentMatrix.

To keep track of what words belong to coffee versus chardonnay, you can set the column names of the TDM like this:

colnames(all_tdm) <- c("chardonnay", "coffee")

Lastly, convert the object to a matrix using as.matrix() for use in comparison.cloud(). For every distinct corpora passed to the comparison.cloud() you can specify a color, as in colors = c("red", "yellow", "green"), to make the sections distinguishable.

Instructions
100 XP

all_corpus is preloaded in your workspace.

  • Create all_clean by applying the predefined clean_corpus function to all_corpus.
  • Create all_tdm, a TermDocumentMatrix, from all_clean.
  • Use colnames() to rename each distinct corpora within all_tdm. Name the first column "coffee" and the second column "chardonnay".
  • Create all_m by converting all_tdm into matrix form.
  • Create a comparison.cloud() using all_m, with colors = c("orange", "blue") and max.words = 50.