Assessing author effort
Authors often use more words when they feel more strongly. For example, an angry airline passenger will tend to leave a longer review the worse the perceived service was. Conversely, a less impassioned passenger may not feel compelled to spend much time writing a review. Lengthy reviews may inflate overall sentiment, since a longer review inherently contains more positive or negative language. This coding exercise examines the relationship between effort and sentiment.
In this exercise you will visualize the relationship between effort and sentiment. Recall that your rental review tibble contains an id and that each row represents a single word. As a result, a simple count() of the id will capture the number of words used in each review. Then you will join this summary to the positive and negative data. Ultimately you will create a scatter plot that visualizes review length and its relationship to polarity.
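As a rough illustration of the counting step, the sketch below uses a small made-up tibble (toy_reviews, a hypothetical stand-in for the course's tidy_reviews) with one row per word. It assumes only that dplyr is installed.

```r
library(dplyr)

# Hypothetical stand-in for tidy_reviews: one row per word, keyed by review id
toy_reviews <- tibble(
  id   = c(1, 1, 1, 2, 2, 3),
  word = c("great", "quiet", "location", "dirty", "room", "fine")
)

# Counting rows per id gives the number of words in each review;
# count() stores the result in a column named n
effort <- toy_reviews %>%
  count(id)

effort
```

Because each row is one word, the n column can serve as a simple proxy for author effort.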
This exercise is part of the course Sentiment Analysis in R
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Review tidy_reviews and pos_neg
tidy_reviews
pos_neg
pos_neg_pol <- tidy_reviews %>%
  # Effort is measured as count by id
  ___(___) %>%
  # Inner join to pos_neg
  ___(___) %>%
  # Add polarity status
  ___(pol = ___(___, "___", "___"))
# Examine results
pos_neg_pol
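For orientation, here is one possible shape of the finished pipeline, run on made-up data rather than the course objects. The toy tables, the polarity column name, the zero cutoff, and the "Positive"/"Negative" labels are all assumptions for illustration; the exercise's own data and expected answer may differ.

```r
library(dplyr)

# Hypothetical stand-in for tidy_reviews: one row per word
toy_reviews <- tibble(
  id   = c(1, 1, 1, 1, 2, 2, 3),
  word = c("great", "quiet", "lovely", "location", "dirty", "room", "fine")
)

# Hypothetical stand-in for pos_neg; the column names are assumed
toy_pos_neg <- tibble(
  id       = c(1, 2),
  positive = c(3, 0),
  negative = c(0, 1),
  polarity = positive - negative
)

toy_pos_neg_pol <- toy_reviews %>%
  # Effort: number of words (rows) per review
  count(id) %>%
  # Keep only reviews that appear in both tables
  inner_join(toy_pos_neg, by = "id") %>%
  # Label each review by the sign of its polarity (cutoff at zero is assumed)
  mutate(pol = ifelse(polarity >= 0, "Positive", "Negative"))

toy_pos_neg_pol
```

Note that the inner join drops review 3, which has no row in the toy polarity table, so only reviews with both a word count and a polarity score would reach the scatter plot.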