Divide & conquer: Using polarity for a comparison cloud

Now that you have seen how polarity can be used to divide a corpus, let's do it! This code will walk you through dividing a corpus based on sentiment so you can peer into the information in subsets instead of holistically.

Your R session has oz_pol, which was created by applying polarity() to "The Wonderful Wizard of Oz."

For simplicity's sake, we created a custom function called pol_subsections() which divides the corpus by polarity score. First, the function accepts a data frame in which each row is a sentence or document of the corpus. Next, the text is subset twice: once where the polarity is greater than 0 and once where it is less than 0, so rows with a polarity of exactly 0 are dropped. Then the positive sentences and the negative sentences are each pasted with the collapse parameter, so that each group becomes a single document. Finally, the two documents are concatenated into a single vector of two distinct documents.

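pol_subsections <- function(df) {
  # Keep the text of rows with positive and negative polarity (0 is dropped)
  x.pos <- subset(df$text, df$polarity > 0)
  x.neg <- subset(df$text, df$polarity < 0)
  # Collapse each group of sentences into one long document
  x.pos <- paste(x.pos, collapse = " ")
  x.neg <- paste(x.neg, collapse = " ")
  # Return a vector of two documents: positive first, negative second
  all.terms <- c(x.pos, x.neg)
  return(all.terms)
}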
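
To see what the function returns, here is a minimal sketch on a toy data frame. The name toy_df, its sentences, and its polarity scores are made up for illustration; they were not produced by polarity(). The function body is copied from above so the snippet runs on its own.

```r
# pol_subsections() as defined in this exercise
pol_subsections <- function(df) {
  x.pos <- subset(df$text, df$polarity > 0)
  x.neg <- subset(df$text, df$polarity < 0)
  x.pos <- paste(x.pos, collapse = " ")
  x.neg <- paste(x.neg, collapse = " ")
  all.terms <- c(x.pos, x.neg)
  return(all.terms)
}

# Hypothetical data: four sentences with hand-picked polarity scores
toy_df <- data.frame(
  text = c("I love this road",
           "The witch was wicked",
           "She walked on",
           "What a wonderful city"),
  polarity = c(0.5, -0.5, 0, 0.6),
  stringsAsFactors = FALSE
)

toy_terms <- pol_subsections(toy_df)

length(toy_terms)  # 2: one positive document, one negative document
toy_terms[1]       # "I love this road What a wonderful city"
toy_terms[2]       # "The witch was wicked"
```

The neutral sentence "She walked on" appears in neither document because its score is exactly 0.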

At this point you have omitted the neutral sentences and want to focus on organizing the remaining text. In this exercise we use the %>% operator again to forward objects to functions. After some simple cleaning, use comparison.cloud() to make the visual.
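
As a reminder of how forwarding with %>% works, here is a small sketch on a made-up character vector. It assumes the operator comes from the magrittr package (other packages also re-export it), and the vector sentences is hypothetical. The piped form passes the left-hand object as the first argument of the next function, so both versions give the same result.

```r
library(magrittr)  # assumption: %>% is loaded from magrittr

# Hypothetical input, for illustration only
sentences <- c("good day", "bad storm")

# Nested call: read from the inside out
nested <- toupper(paste(sentences, collapse = " "))

# Piped call: read from top to bottom
piped <- sentences %>%
  paste(collapse = " ") %>%
  toupper()

identical(nested, piped)  # TRUE
```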

This exercise is part of the course

Sentiment Analysis in R

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

oz_df <- oz_pol$all %>%
  # Select text.var as text and polarity
  ___(text = ___, polarity = ___)

# Apply custom function pol_subsections()
all_terms <- ___(___)

all_corpus <- all_terms %>%
  # Source from a vector
  ___() %>% 
  # Make a volatile corpus 
  ___()