Exercise

The wonderful wizard of NRC

Last but not least, you get to work with the NRC lexicon, which labels words across multiple emotional states. Remember Plutchik's wheel of emotion? The NRC lexicon tags words according to Plutchik's 8 emotions plus positive and negative.

This exercise introduces a new operator, %in%, which checks whether each element of one vector appears in another. In the code below, %in% returns FALSE, FALSE, TRUE. This is because, within some_vec, 1 and 2 are not found in some_other_vector, but 3 is found and returns TRUE. The %in% operator is useful for finding matches.

some_vec <- c(1, 2, 3)
some_other_vector <- c(3, "a", "b")
some_vec %in% some_other_vector

Another new operator is !. For logical conditions, adding ! inverts the result. In the example above, FALSE, FALSE, TRUE becomes TRUE, TRUE, FALSE. Using it in concert with %in% inverts the response, which is good for removing items that are matched.

!some_vec %in% some_other_vector
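
To see how the two operators help with keeping or removing matched items, here is a minimal sketch in base R that reuses the vectors above. The names is_match, kept, and dropped are illustrative only and do not appear in the exercise.

```r
some_vec <- c(1, 2, 3)
some_other_vector <- c(3, "a", "b")

# Logical vector: TRUE where an element of some_vec appears in some_other_vector
is_match <- some_vec %in% some_other_vector

# Keep only the matched elements
kept <- some_vec[is_match]

# Negating the logical vector drops the matched elements instead
dropped <- some_vec[!is_match]

print(kept)
print(dropped)
```

Here kept holds 3, while dropped holds 1 and 2. The same pattern works inside filter() when rows should be excluded based on a set of values.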

We've created oz, the tidy version of The Wizard of Oz, along with nrc, which contains the "NRC" lexicon with renamed columns.

Instructions 1/2

  • Inner join oz to the nrc lexicon.
    • Call inner_join() to join the tibbles.
    • Join by the term column in the text and the word column in the lexicon.
  • Filter to only Plutchik's emotions by dropping the positive and negative words in the lexicon.
    • Use filter() to keep rows where the sentiment is neither "positive" nor "negative".
  • Group by sentiment.
    • Call group_by(), passing sentiment without quotes.
  • Get the total count of each sentiment.
    • Call summarize(), setting total_count equal to the sum() of count.
    • Assign the result to oz_plutchik.
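
The steps above can be sketched roughly as follows, assuming dplyr is installed. The real oz and nrc objects are provided by the exercise environment, so the small tibbles below are hypothetical stand-ins: their words, counts, and sentiment tags are made up for illustration and are not actual NRC entries. Only the column names term, count, word, and sentiment come from the instructions.

```r
library(dplyr)

# Hypothetical stand-in for the tidy text: one row per term with its count
oz <- tibble(
  term  = c("wizard", "afraid", "happy", "storm"),
  count = c(10, 4, 6, 3)
)

# Hypothetical stand-in for the lexicon: a word can carry several sentiments
nrc <- tibble(
  word      = c("afraid", "afraid", "happy", "happy", "happy", "storm", "storm"),
  sentiment = c("fear", "negative", "joy", "positive", "trust", "fear", "negative")
)

oz_plutchik <- oz %>%
  # Match the term column in the text to the word column in the lexicon
  inner_join(nrc, by = c("term" = "word")) %>%
  # Drop the positive/negative tags, keeping only the emotion categories
  filter(!sentiment %in% c("positive", "negative")) %>%
  # One group per remaining sentiment
  group_by(sentiment) %>%
  # Total count of terms within each sentiment
  summarize(total_count = sum(count))

print(oz_plutchik)
```

With this toy data, "wizard" is dropped by the inner join because it has no lexicon entry, and the result has one row each for fear, joy, and trust.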