Divide & conquer: Using polarity for a comparison cloud
Now that you have seen how polarity can be used to divide a corpus, let's do it! This code will walk you through dividing a corpus based on sentiment so you can peer into the information in subsets instead of holistically.
Your R session has oz_pol, which was created by applying polarity() to "The Wonderful Wizard of Oz."
For simplicity's sake, we created a custom function called pol_subsections(), which divides the corpus by polarity score. First, the function accepts a data frame in which each row is a sentence or document of the corpus. Next, the text is subset twice: once where the polarity value is greater than 0 and once where it is less than 0, so sentences with a polarity of exactly 0 are dropped. The positive sentences and the negative sentences are then each pasted together using the collapse parameter, so that each group becomes a single document. Lastly, the two documents are concatenated into a single vector of two distinct documents.
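pol_subsections <- function(df) {
  # Keep only sentences with non-zero polarity, split by sign
  x.pos <- subset(df$text, df$polarity > 0)
  x.neg <- subset(df$text, df$polarity < 0)
  # Collapse each group of sentences into one long document
  x.pos <- paste(x.pos, collapse = " ")
  x.neg <- paste(x.neg, collapse = " ")
  # Return a vector of two documents: positive first, negative second
  all.terms <- c(x.pos, x.neg)
  return(all.terms)
}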
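To see what the function returns, here is a minimal sketch on a made-up data frame. The sentences and polarity scores in toy_df are hypothetical and are not taken from oz_pol; the function body is repeated only so the snippet runs on its own in base R.

```r
# Custom function from the text, repeated so this sketch is self-contained
pol_subsections <- function(df) {
  x.pos <- subset(df$text, df$polarity > 0)
  x.neg <- subset(df$text, df$polarity < 0)
  x.pos <- paste(x.pos, collapse = " ")
  x.neg <- paste(x.neg, collapse = " ")
  all.terms <- c(x.pos, x.neg)
  return(all.terms)
}

# Hypothetical toy data: sentences and scores are invented for illustration
toy_df <- data.frame(
  text = c("what a lovely day", "the witch was wicked",
           "they walked on", "a good kind heart"),
  polarity = c(0.5, -0.5, 0, 0.75),
  stringsAsFactors = FALSE
)

toy_terms <- pol_subsections(toy_df)

length(toy_terms)  # 2: one positive document, one negative document
toy_terms[1]       # "what a lovely day a good kind heart"
toy_terms[2]       # "the witch was wicked"
```

The neutral sentence ("they walked on", polarity 0) appears in neither document, which is the omission the next paragraph refers to.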
At this point you have omitted the neutral sentences and want to focus on organizing the remaining text. In this exercise we use the %>% operator again to forward objects to functions. After some simple cleaning, use comparison.cloud() to make the visual.
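comparison.cloud() works on a term matrix whose rows are words and whose columns are documents. As a rough illustration of that shape, the base R sketch below counts words in a hypothetical two-document vector. The names docs, tokens, vocab, and term_matrix are made up for this example, and the simple regular-expression tokenizer is an assumption standing in for the cleaning that a text-mining package would normally do.

```r
# Hypothetical two-document vector, shaped like the output of pol_subsections()
docs <- c("good kind good heart", "wicked witch wicked")

# Lowercase each document and split it into words on non-letter characters
tokens <- strsplit(tolower(docs), "[^a-z]+")

# All distinct terms across both documents
vocab <- sort(unique(unlist(tokens)))

# Count every vocabulary term in each document
term_matrix <- sapply(tokens, function(w) {
  as.integer(table(factor(w, levels = vocab)))
})

# Rows are terms, columns are the two documents
dimnames(term_matrix) <- list(vocab, c("positive", "negative"))
term_matrix
```

Words that are frequent in one column and rare in the other are the ones a comparison cloud would emphasize for that document.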
This exercise is part of the course Sentiment Analysis in R.
Have a go at this exercise by completing this sample code.
oz_df <- oz_pol$all %>%
  # Select text.var as text and polarity
  ___(text = ___, polarity = ___)

# Apply custom function pol_subsections()
all_terms <- ___(___)

all_corpus <- all_terms %>%
  # Source from a vector
  ___() %>%
  # Make a volatile corpus
  ___()