Kernel density plot
Now that you have learned about kernel density plots, you can create one! Remember that a kernel density plot is like a smoothed histogram, but it isn't affected by binwidth. This exercise will help you construct a kernel density plot from sentiment values.
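To make the binwidth point concrete, here is a minimal base R sketch. The `scores` vector is made up for illustration and does not come from either book. It shows that histogram counts change with the bin edges, while `density()` produces a curve with no bins at all; its smoothness is governed by a bandwidth instead.

```r
# Hypothetical sentiment scores, invented for illustration only
scores <- c(-3, -2, -2, -1, 1, 2, 2, 3, 3, 3)

# Histogram counts depend on where the bin edges fall
wide_bins   <- hist(scores, breaks = seq(-4, 4, by = 2), plot = FALSE)$counts
narrow_bins <- hist(scores, breaks = seq(-4, 4, by = 1), plot = FALSE)$counts

# A kernel density estimate has no bins; the bandwidth (kde$bw) sets the smoothing
kde <- density(scores)

print(wide_bins)
print(narrow_bins)
print(kde$bw)
```

Calling `plot(kde)` would draw the smoothed curve. The area under it is approximately 1, which is why the y-axis of a density plot reads as probability density rather than counts.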
In this exercise you will plot two kernel densities: one for Agamemnon and another for The Wizard of Oz. For both, you will perform an inner_join() with the "afinn" lexicon. Recall that the "afinn" lexicon scores terms from -5 to 5. After the join, each book keeps only the words found in the lexicon, along with their scores.
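As a rough illustration of what the join does, the sketch below uses two tiny hand-made data frames. `toy_book` and `toy_lexicon` are hypothetical stand-ins, their scores are invented rather than taken from the real lexicon, and the column names `term`, `word` and `value` are assumptions chosen to mirror the `by` argument used in this exercise.

```r
library(dplyr)

# Hypothetical tidy book: one row per term with a count
toy_book <- data.frame(
  term = c("the", "happy", "road", "wicked", "good"),
  n    = c(10, 2, 3, 1, 4),
  stringsAsFactors = FALSE
)

# Hypothetical lexicon: invented scores, not real AFINN entries
toy_lexicon <- data.frame(
  word  = c("happy", "wicked", "good", "terrible"),
  value = c(3, -2, 3, -3),
  stringsAsFactors = FALSE
)

# inner_join() keeps only the rows whose term also appears in the lexicon
toy_scored <- toy_book %>%
  inner_join(toy_lexicon, by = c("term" = "word"))

print(toy_scored)
```

Terms without a lexicon entry ("the", "road") are dropped, and lexicon words that never occur in the book ("terrible") do not appear either.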
After that, you need to row bind the results into a larger data frame using bind_rows() and create a plot with ggplot2.
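The row bind and plotting steps can be sketched on toy data as follows. Everything here is an assumption made for illustration: the data frames `toy_a` and `toy_b`, the labels `first` and `second`, the id column name `source`, and the score column name `value`. It is one possible way to draw overlapping densities, not necessarily the plot the exercise expects.

```r
library(dplyr)
library(ggplot2)

# Hypothetical scored terms for two texts (invented values)
toy_a <- data.frame(term = c("grief", "doom", "joy"),
                    value = c(-2, -2, 3),
                    stringsAsFactors = FALSE)
toy_b <- data.frame(term = c("good", "happy", "wicked", "kind"),
                    value = c(3, 3, -2, 2),
                    stringsAsFactors = FALSE)

# Named arguments become the values of the .id column
toy_all <- bind_rows(first = toy_a, second = toy_b, .id = "source")

# One density curve per source; alpha makes the overlap visible
density_plot <- ggplot(toy_all, aes(x = value, fill = source)) +
  geom_density(alpha = 0.3)
```

Printing `density_plot` draws the figure. Mapping the id column to `fill` is what produces a separate, semi-transparent curve for each text.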
From the visual you will be able to see which book uses more positive versus negative language. The two distributions clearly overlap, since negative things happen to Dorothy too, but you could infer that the kernel density shows a greater probability of positive language in The Wizard of Oz than in Agamemnon.
We've loaded ag and oz as tidy versions of Agamemnon and The Wizard of Oz respectively, and created afinn as a subset of the tidytext "afinn" lexicon.
This exercise is part of the course Sentiment Analysis in R.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
ag_afinn <- ag %>%
  # Inner join to afinn lexicon
  ___(___, by = c("term" = "word"))

oz_afinn <- oz %>%
  # Inner join to afinn lexicon
  ___

# Combine
all_df <- ___(agamemnon = ___, oz = ___, .id = "___")