Bing tidy polarity: Count & pivot the white whale
In this exercise you will apply another inner_join() using the "bing" lexicon. Then you will manipulate the results with both count() from dplyr and pivot_wider() from tidyr to learn about the text.
The pivot_wider() function spreads data across multiple columns. In this case, the sentiment and corresponding n values represent the frequency of positive or negative terms for each line. Using pivot_wider() reshapes the data so that each row (one per line) has both a positive and a negative value, even if one of them is 0.
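As a minimal sketch of that reshaping, the toy counts below (hypothetical data, not taken from Moby Dick) have no negative row for line 2, so values_fill = 0 is what supplies the missing zero:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy counts: line 2 has no negative terms,
# so there is no (negative, 2) row in the long data
toy_count <- tibble(
  sentiment = c("negative", "positive", "positive"),
  index     = c(1, 1, 2),
  n         = c(2L, 1L, 3L)
)

# One row per index, one column per sentiment; gaps become 0
toy_wide <- toy_count %>%
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) %>%
  arrange(index)

toy_wide
```

Without values_fill, the missing combination would appear as NA rather than 0.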
This exercise is part of the course
Sentiment Analysis in R
Exercise instructions
In this exercise, your R session has m_dick_tidy, which contains the book Moby Dick, and bing, which contains the lexicon, similar to the previous exercise.
- Perform an inner_join() on m_dick_tidy and bing.
  - As before, join the "term" column in m_dick_tidy to the "word" column in the lexicon.
  - Call the new object moby_lex_words.
- Create a column index, equal to as.numeric() applied to document. This occurs within mutate() in the tidyverse.
- Create moby_count by forwarding moby_lex_words to count(), passing in sentiment, index.
- Generate moby_wide by piping moby_count to pivot_wider(), where names_from equals the sentiment column, values_from equals the n column, and missing values are filled in with values_fill = 0. arrange() is the next pipe, used to order the rows by index values.
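The steps above can be sketched end to end on a tiny made-up data set. The objects toy_tidy and toy_bing are hypothetical stand-ins for m_dick_tidy and bing, assumed only to share the column names the instructions mention (document, term, word, sentiment):

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for m_dick_tidy: one row per term per line
toy_tidy <- tibble(
  document = c("1", "1", "2", "2", "3"),
  term     = c("good", "bad", "happy", "whale", "sad")
)

# Hypothetical stand-in for the bing lexicon
toy_bing <- tibble(
  word      = c("good", "happy", "bad", "sad"),
  sentiment = c("positive", "positive", "negative", "negative")
)

# Keep only terms found in the lexicon, then add a numeric line index
toy_lex_words <- inner_join(toy_tidy, toy_bing, by = c("term" = "word")) %>%
  mutate(index = as.numeric(document))

# Count terms per sentiment and line
toy_count <- toy_lex_words %>%
  count(sentiment, index)

# One row per line, with a 0 wherever a sentiment is absent
toy_wide <- toy_count %>%
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) %>%
  arrange(index)

toy_wide
```

The term "whale" is not in the toy lexicon, so the inner join drops it before anything is counted.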
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Inner join
moby_lex_words <- inner_join(___, ___, by = c("___" = "___"))

moby_lex_words <- moby_lex_words %>%
  # Set index to numeric document
  mutate(___ = as.numeric(___))

moby_count <- moby_lex_words %>%
  # Count by sentiment, index
  ___(___, ___)

# Examine the counts
moby_count

moby_wide <- moby_count %>%
  # Pivot the sentiments
  pivot_wider(names_from = ___, values_from = ___, values_fill = ___) %>%
  arrange(index)

# Review the pivoted data
moby_wide