Get startedGet started for free

Assessing author effort

Often authors will use more words when they are more passionate. For example, a mad airline passenger will leave a longer review the worse (the perceived) service. Conversely a less impassioned passenger may not feel compelled to spend a lot of time writing a review. Lengthy reviews may inflate overall sentiment since the reviews will inherently contain more positive or negative language as the review lengthens. This coding exercise helps to examine effort and sentiment.

In this exercise you will visualize the relationship between effort and sentiment. Recall your rental review tibble contains an id and that a word is represented in each row. As a result a simple count() of the id will capture the number of words used in each review. Then you will join this summary to the positive and negative data. Ultimately you will create a scatter plot that will visualize author review length and its relationship to polarity.

This exercise is part of the course

Sentiment Analysis in R

View Course

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Review tidy_reviews and pos_neg
tidy_reviews
pos_neg

pos_neg_pol <- tidy_reviews %>% 
  # Effort is measured as count by id
  ___(___) %>% 
  # Inner join to pos_neg
  ___(___) %>% 
  # Add polarity status
  ___(pol = ___(___, "___", "___"))

# Examine results
pos_neg_pol
Edit and Run Code