Bing tidy polarity: Simple example
Now that you understand the basics of an inner join, let's apply this to the "Bing" lexicon. Keep in mind the inner_join() function comes from dplyr and the lexicon object is obtained using tidytext's get_sentiments() function.
The Bing lexicon labels words as positive or negative. The next three exercises let you interact with this specific lexicon. To use get_sentiments(), pass in a string such as "afinn", "bing", "nrc", or "loughran" to download the specific lexicon.
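As a quick illustration of that call, the sketch below fetches the Bing lexicon and inspects its columns. It assumes the tidytext package is installed. Only "bing" is used here because, in recent tidytext versions, the other lexicons may prompt a one-time download through the textdata package.

```r
library(tidytext)

# Fetch the Bing lexicon: one row per word with its sentiment label
bing <- get_sentiments("bing")

# Inspect the column names; a later join relies on the "word" column
names(bing)

# List the distinct sentiment labels used by this lexicon
unique(bing$sentiment)
```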
The inner join workflow:
- Obtain the correct lexicon using get_sentiments().
- Pass the lexicon and the tidy text data to inner_join().
- In order for inner_join() to work there must be a shared column name. If there are no shared column names, declare them with an additional parameter, by, equal to c() with the column names, like below.
object <- x %>%
  inner_join(y, by = c("column_from_x" = "column_from_y"))
- Perform some aggregation and analysis on the table intersection.
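The workflow above can be sketched end to end on toy data. Everything named toy_* here is hypothetical: toy_lexicon is a hand-made stand-in for the output of get_sentiments(), and toy_tidy stands in for tidy text data whose word column is called "term". This is an illustration of the by argument, not the exercise solution.

```r
library(dplyr)

# Hypothetical tidy text data: one row per term (column is named "term")
toy_tidy <- tibble(
  term = c("good", "the", "bad", "happy", "king")
)

# Hypothetical mini lexicon standing in for get_sentiments() output
toy_lexicon <- tibble(
  word = c("good", "bad", "happy"),
  sentiment = c("positive", "negative", "positive")
)

# The word columns have different names, so map them with `by`
toy_joined <- toy_tidy %>%
  inner_join(toy_lexicon, by = c("term" = "word"))

# Only terms present in both tables survive the join
toy_joined

# Aggregate: number of matched terms per sentiment
toy_counts <- toy_joined %>%
  count(sentiment)
toy_counts
```

Terms such as "the" and "king" have no match in the toy lexicon, so an inner join drops them; only the intersection is left for counting.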
This exercise is part of the course
Sentiment Analysis in R
Exercise instructions
We've loaded ag_txt containing the first 100 lines from Agamemnon and ag_tidy which is the tidy version.
- For comparison, use polarity() on ag_txt.
- Get the "bing" lexicon by passing that string to get_sentiments().
- Perform an inner_join() with ag_tidy and bing.
  - The word columns are called "term" in ag_tidy and "word" in the lexicon, so declare the by argument.
  - Call the new object ag_bing_words.
- Print ag_bing_words, and look at some of the words that are in the result.
- Pass ag_bing_words to count() of sentiment using the pipe operator, %>%. Compare the polarity() score to the sentiment count ratio.
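The exercise does not define "sentiment count ratio", so the sketch below shows two plausible readings, both of which are assumptions: the positive-to-negative ratio and the net share of positive words. The counts are made up for illustration and are not the Agamemnon results.

```r
library(dplyr)

# Hypothetical result of count(sentiment); these numbers are invented
sentiment_counts <- tibble(
  sentiment = c("negative", "positive"),
  n = c(30L, 20L)
)

# Pull each count out by its label
n_pos <- sentiment_counts$n[sentiment_counts$sentiment == "positive"]
n_neg <- sentiment_counts$n[sentiment_counts$sentiment == "negative"]

# Reading 1: positive words per negative word
pos_neg_ratio <- n_pos / n_neg

# Reading 2: net share, ranging from -1 (all negative) to 1 (all positive)
net_share <- (n_pos - n_neg) / (n_pos + n_neg)

pos_neg_ratio
net_share
```

Neither number is on the same scale as a polarity() score, so the comparison is mainly about sign and rough balance rather than exact magnitude.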
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Qdap polarity
___
# Get Bing lexicon
bing <- get_sentiments("___")
# Join text to lexicon
ag_bing_words <- ___(___, ___, by = c("___" = "___"))
# Examine
ag_bing_words
# Get counts by sentiment
ag_bing_words %>%
  ___(___)