
Feature extraction & analysis: amzn_cons

You now decide to contrast this with the amzn_cons_corp corpus in another bigram TDM. Of course, you expect to see some different phrases in your word cloud.

Once again, you will use this custom function to extract your bigram features for the visual:

tokenizer <- function(x) 
  NGramTokenizer(x, Weka_control(min = 2, max = 2))
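To see what this tokenizer produces, it can help to call it on a single string first. The following is a minimal sketch, assuming the RWeka package (and its Java dependency) is installed; the sample sentence is made up for illustration and is not taken from the Amazon reviews.

```r
library(RWeka)

# Same bigram tokenizer as in the exercise: min = max = 2 keeps only two-word phrases
tokenizer <- function(x)
  NGramTokenizer(x, Weka_control(min = 2, max = 2))

# Hypothetical sample text, not from the course data
bigrams <- tokenizer("long hours and poor management")
print(bigrams)
```

Each element of the result is a two-word phrase built from adjacent words, which is what later becomes a row in the bigram TDM.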

This exercise is part of the course Text Mining with Bag-of-Words in R.

Exercise instructions

  • Create amzn_c_tdm by converting amzn_cons_corp into a TermDocumentMatrix, passing the bigram tokenizer via control = list(tokenize = tokenizer).
  • Create amzn_c_tdm_m as a matrix version of amzn_c_tdm.
  • Create amzn_c_freq by using rowSums() to get term frequencies from amzn_c_tdm_m.
  • Create a wordcloud() using names(amzn_c_freq) and the values amzn_c_freq. Use the arguments max.words = 25 and color = "red" as well.
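The four steps above can be sketched end to end on a toy corpus. This is an illustrative sketch, not the course solution: toy_cons and every toy_* name are hypothetical stand-ins for amzn_cons_corp and the amzn_c_* objects, and it assumes the tm, RWeka, and wordcloud packages are installed. It sets min.freq = 1 only because the toy corpus is tiny, and it spells the color argument out in full as colors.

```r
library(tm)
library(RWeka)
library(wordcloud)

# Bigram tokenizer from the exercise
tokenizer <- function(x)
  NGramTokenizer(x, Weka_control(min = 2, max = 2))

# Hypothetical stand-in for amzn_cons_corp (two made-up negative reviews)
toy_cons <- c("long hours and poor management",
              "long hours with no work life balance")
toy_corp <- VCorpus(VectorSource(toy_cons))

# Step 1: bigram term-document matrix
toy_tdm <- TermDocumentMatrix(toy_corp, control = list(tokenize = tokenizer))

# Step 2: convert to a plain matrix (terms in rows, documents in columns)
toy_tdm_m <- as.matrix(toy_tdm)

# Step 3: sum across documents to get one frequency per bigram
toy_freq <- rowSums(toy_tdm_m)

# Step 4: plot; min.freq = 1 so the low counts of the toy corpus still appear
wordcloud(names(toy_freq), toy_freq,
          min.freq = 1, max.words = 25, colors = "red")
```

In the exercise itself, the same calls apply with amzn_cons_corp in place of toy_corp and the object names given in the instructions.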

Hands-on interactive exercise

Complete this exercise by finishing the sample code below.

# Create amzn_c_tdm
___ <- ___(
  ___,
  ___
)

# Create amzn_c_tdm_m
___ <- ___

# Create amzn_c_freq
___ <- ___

# Plot a word cloud of negative Amazon bigrams
___