Frequent terms with qdap
If you are OK giving up some control over the exact preprocessing steps, then a fast way to get frequent terms is with freq_terms() from qdap.
The function accepts a text variable, which in our case is the tweets$text vector. You can specify the number of top terms to show with the top argument, a vector of stop words to remove with the stopwords argument, and the minimum character length of a word to be included with the at.least argument. qdap has its own list of stop words that differs from the one in tm. Our exercise will show you how to use either and compare their results.
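To build intuition for what the top, at.least, and stopwords arguments control, here is a rough base-R sketch. It is not how qdap implements freq_terms(): the helper top_terms() and the toy_text vector are hypothetical names invented for this illustration, the tokenization is deliberately simplistic, and, unlike this sketch, freq_terms() may return more than top rows when terms are tied.

```r
# Hypothetical helper (NOT part of qdap): approximates the effect of the
# top, at.least, and stopwords arguments using base R only.
top_terms <- function(text, top = 20, at.least = 1, stopwords = NULL) {
  # Lowercase, then split on anything that is not a letter or apostrophe
  words <- unlist(strsplit(tolower(text), "[^a-z']+"))
  # Drop empty strings and words shorter than at.least characters
  words <- words[nchar(words) >= max(at.least, 1)]
  # Remove any stop words that were supplied
  words <- words[!words %in% tolower(stopwords)]
  # Count each word and sort by descending frequency
  counts <- sort(table(words), decreasing = TRUE)
  result <- data.frame(
    WORD = as.character(names(counts)),
    FREQ = as.integer(counts),
    stringsAsFactors = FALSE
  )
  # Keep only the first top rows (ties are not extended here)
  head(result, top)
}

# Toy input standing in for tweets$text (assumed data, for illustration only)
toy_text <- c(
  "Coffee is great and coffee is life",
  "I love coffee and I love mornings",
  "Tea is fine but coffee wins"
)

result <- top_terms(
  toy_text,
  top = 2,
  at.least = 3,
  stopwords = c("and", "but")
)
print(result)
```

With at.least = 3, short words such as "is" and "I" are dropped before counting, and the stop word vector then removes "and" and "but", so only content words compete for the top two places.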
Making a basic plot of the results is easy. Just call plot() on the freq_terms() object.
This exercise is part of the course Text Mining with Bag-of-Words in R.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create frequency
frequency <- ___(
  ___,
  top = ___,
  at.least = ___,
  stopwords = ___
)
# Make a frequency bar chart