Grouped summaries
So there are more non-complaints than complaints in twitter_data. You might be starting to question whether or not this data is actually from Twitter! There are a few other columns of interest in twitter_data that would be helpful to explore before you get to the tweets themselves. Every tweet includes the number of followers that user has in the usr_followers_count column. Do you expect those who complain to have more followers or fewer followers, on average, than those who don't complain? You can use grouped summaries to quickly and easily provide an answer.
This exercise is part of the course
Introduction to Text Analysis in R
Exercise instructions
- Group the data by complaint_label.
- Compute the average, minimum, and maximum of usr_followers_count.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Start with the data frame
___ %>% 
  # Group the data by whether or not the tweet is a complaint
  ___(___) %>% 
  # Compute the mean, min, and max follower counts
  summarize(
    avg_followers = ___(___),
    min_followers = ___(___),
    max_followers = ___(___)
  )
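
To see the grouped-summary pattern run end to end without the course data, here is a minimal sketch on a made-up data frame. The name toy_tweets, its follower counts, and the label values "Complaint" and "Non-Complaint" are assumptions for illustration only; they are not taken from twitter_data. Only the column names complaint_label and usr_followers_count come from the exercise.

```r
library(dplyr)

# Hypothetical stand-in for twitter_data; all values here are invented
toy_tweets <- data.frame(
  complaint_label = c("Complaint", "Complaint",
                      "Non-Complaint", "Non-Complaint", "Non-Complaint"),
  usr_followers_count = c(120, 80, 1500, 30, 270)
)

follower_summary <- toy_tweets %>%
  # One group per distinct value of complaint_label
  group_by(complaint_label) %>%
  # Collapse each group to a single row of summary statistics
  summarize(
    avg_followers = mean(usr_followers_count),
    min_followers = min(usr_followers_count),
    max_followers = max(usr_followers_count)
  )

print(follower_summary)
```

On this toy data the result has one row per label: the complaint group averages 100 followers (minimum 80, maximum 120) and the non-complaint group averages 600 (minimum 30, maximum 1500). The single large account pulls the non-complaint mean well above most of its values, which is a reason to look at the minimum and maximum alongside the average.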