Cleaning and counting
Remove stop words to explore the content of just the airline tweets classified as complaints in twitter_data.
This exercise is part of the course
Introduction to Text Analysis in R
Exercise instructions
- Tokenize the tweets in twitter_data. Name the column with tokenized words as word.
- Remove the default stop words from the tokenized twitter_data.
- Filter to keep the complaints only.
- Compute word counts using the tokenized, cleaned text and arrange in descending order by count.
 
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
tidy_twitter <- twitter_data %>% 
  # Tokenize the twitter data
  ___(___, ___) %>% 
  # Remove stop words
  anti_join(stop_words)
tidy_twitter %>% 
  # Filter to keep complaints only
  ___(___ == ___) %>% 
  # Compute word counts and arrange in descending order
  ___(___) %>% 
  ___(___)
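
For reference, here is one way the four steps could be filled in. It is a minimal sketch on a toy data frame, not the exercise's official solution. The column names tweet_text and complaint_label and the label value "Complaint" are assumptions made for illustration; check the real twitter_data for its actual names before reusing this.

```r
library(dplyr)
library(tidytext)

# Toy stand-in for twitter_data.
# ASSUMPTION: the column names and label values below are hypothetical.
twitter_data <- tibble(
  complaint_label = c("Complaint", "Non-Complaint", "Complaint"),
  tweet_text = c(
    "My flight was delayed again",
    "Thanks for the great flight",
    "Delayed flight and lost luggage"
  )
)

tidy_twitter <- twitter_data %>%
  # Tokenize: one lowercased word per row, stored in a column named word
  unnest_tokens(word, tweet_text) %>%
  # Drop rows whose word appears in tidytext's default stop_words table
  anti_join(stop_words, by = "word")

word_counts <- tidy_twitter %>%
  # Keep only tweets labelled as complaints (assumed label value)
  filter(complaint_label == "Complaint") %>%
  # Count occurrences of each word; count() names the result column n
  count(word) %>%
  # Most frequent words first
  arrange(desc(n))

print(word_counts)
```

Passing by = "word" to anti_join() is optional, but it silences the message dplyr prints when it has to guess the join column. count(word, sort = TRUE) would be an equivalent shortcut for the last two steps.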