
Cleaning and counting

Remove stop words to explore the content of just the airline tweets classified as complaints in twitter_data.

This exercise is part of the course

Introduction to Text Analysis in R


Instructions

  • Tokenize the tweets in twitter_data. Name the column with tokenized words as word.
  • Remove the default stop words from the tokenized twitter_data.
  • Filter to keep the complaints only.
  • Compute word counts using the tokenized, cleaned text and arrange in descending order by count.

Hands-on interactive exercise

Try this exercise by completing this sample code.

tidy_twitter <- twitter_data %>% 
  # Tokenize the twitter data
  ___(___, ___) %>% 
  # Remove stop words
  anti_join(stop_words)

tidy_twitter %>% 
  # Filter to keep complaints only
  ___(___ == ___) %>% 
  # Compute word counts and arrange in descending order
  ___(___) %>% 
  ___(___)
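
One possible way to fill in the blanks is sketched below. It is not the official solution. The exercise does not show the columns of twitter_data, so the toy data frame and the column names tweet_text and complaint_label, as well as the label value "Complaint", are assumptions made only so the example runs on its own; replace them with the names in the real dataset.

```r
library(dplyr)
library(tidytext)

# Hypothetical stand-in for twitter_data: the column names and label
# values here are assumptions, not taken from the exercise.
twitter_data <- tibble(
  tweet_id = 1:3,
  complaint_label = c("Complaint", "Non-Complaint", "Complaint"),
  tweet_text = c(
    "My flight was delayed and my luggage is lost",
    "Thanks for the great flight",
    "Delayed again, the worst flight ever"
  )
)

tidy_twitter <- twitter_data %>%
  # Tokenize: one lowercase word per row, stored in a column named word
  unnest_tokens(word, tweet_text) %>%
  # Remove stop words by dropping rows whose word appears in stop_words
  anti_join(stop_words, by = "word")

complaint_counts <- tidy_twitter %>%
  # Filter to keep complaints only
  filter(complaint_label == "Complaint") %>%
  # Compute word counts (count() names the count column n)
  count(word) %>%
  # Arrange in descending order by count
  arrange(desc(n))

print(complaint_counts)
```

Passing by = "word" to anti_join() is optional here, since word is the only column the two tables share, but it suppresses the message about the join column. count(word, sort = TRUE) would produce the same ordering in a single step.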