Feature extraction & analysis: amzn_cons
You now decide to contrast this with the amzn_cons_corp corpus by building another bigram term-document matrix (TDM). Naturally, you should expect to see some different phrases in your word cloud.
Once again, you will use this custom function to extract your bigram features for the visual:
library(RWeka)
# Split a document into two-word phrases (bigrams)
tokenizer <- function(x)
  NGramTokenizer(x, Weka_control(min = 2, max = 2))
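As a quick check of what this tokenizer returns, the sketch below applies it to a single sentence. The sample sentence is made up for illustration and is not taken from the Amazon reviews; the sketch also assumes the RWeka package (and the Java runtime it depends on) is installed.

```r
library(RWeka)

# Same bigram tokenizer as in the exercise
tokenizer <- function(x)
  NGramTokenizer(x, Weka_control(min = 2, max = 2))

# Hypothetical sample text, not part of the course data
bigrams <- tokenizer("long hours and poor management")

# Five words yield four overlapping two-word phrases
print(bigrams)
length(bigrams)
```

Because min and max are both 2, only two-word phrases are produced; single words and trigrams are left out.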
This exercise is part of the course
Text Mining with Bag-of-Words in R
Exercise instructions
- Create amzn_c_tdm by converting amzn_cons_corp into a TermDocumentMatrix and incorporating the bigram function control = list(tokenize = tokenizer).
- Create amzn_c_tdm_m as a matrix version of amzn_c_tdm.
- Create amzn_c_freq by using rowSums() to get term frequencies from amzn_c_tdm_m.
- Create a wordcloud() using names(amzn_c_freq) and the values amzn_c_freq. Use the arguments max.words = 25 and color = "red" as well.
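The same sequence of steps can be tried on a tiny stand-in corpus first. This is only a sketch under stated assumptions: the three toy reviews and every toy_* name are hypothetical, and the tm, RWeka and wordcloud packages are assumed to be installed. The corpus is built with VCorpus() here because tm applies custom tokenizers to that corpus type.

```r
library(tm)
library(RWeka)
library(wordcloud)

# Hypothetical stand-in reviews; the exercise itself uses amzn_cons_corp
toy_cons <- c(
  "long hours and poor work life balance",
  "poor work life balance and long hours",
  "long hours with little recognition"
)
toy_corp <- VCorpus(VectorSource(toy_cons))

# Bigram tokenizer from the exercise
tokenizer <- function(x)
  NGramTokenizer(x, Weka_control(min = 2, max = 2))

# Bigram term-document matrix: one row per bigram, one column per review
toy_tdm <- TermDocumentMatrix(toy_corp, control = list(tokenize = tokenizer))

# Plain matrix version, then total count of each bigram across reviews
toy_tdm_m <- as.matrix(toy_tdm)
toy_freq <- rowSums(toy_tdm_m)

# Most frequent bigrams first
head(sort(toy_freq, decreasing = TRUE))

# min.freq = 1 is set only because the toy counts are small;
# colors is the full name of the argument the exercise abbreviates as color
wordcloud(names(toy_freq), toy_freq,
          min.freq = 1, max.words = 25, colors = "red")
```

In this toy data "long hours" occurs in all three reviews, so it should appear as the largest phrase in the cloud.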
Interactive exercise
Complete the sample code to successfully finish this exercise.
# Create amzn_c_tdm
___ <- ___(
___,
___
)
# Create amzn_c_tdm_m
___ <- ___
# Create amzn_c_freq
___ <- ___
# Plot a word cloud of negative Amazon bigrams
___