Extracting mentions
In each sublist of the dataset of tweets, there is an element called "mentions_screen_name" that holds the Twitter handles mentioned in the tweet. This element is either NULL, if the tweet mentions no one, or one or more screen names. One way to spot popular accounts in a tweet collection is to find the users who are mentioned most often.
We'll first extract a vector of all mentions, then count the number of times each profile is mentioned. To do that, we'll build a new composed function by combining table() (which counts the number of occurrences of each element in the vector), sort(), and tail().
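As a rough illustration of how such a composed counter behaves, here is a minimal sketch on a made-up vector. The names handles and top_handles are hypothetical and do not come from the exercise. It assumes purrr's compose() with its default right-to-left order, so the last function listed runs first.

```r
library(purrr)

# Hypothetical toy vector of handles, not taken from rstudioconf
handles <- c("a_user", "b_user", "a_user", "c_user", "a_user", "b_user")

# compose() applies its functions from right to left by default:
# table() counts each handle, sort() orders the counts in increasing
# order, and tail() keeps the last (largest) entries, six by default
top_handles <- compose(tail, sort, table)

top_handles(handles)
```

Because sort() orders the counts from smallest to largest, the most mentioned handle ends up last in the result.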
purrr has been loaded for you, and rstudioconf is available in your workspace.
This exercise is part of the course
Intermediate Functional Programming with purrr
Exercise instructions
Build a function that is the combination of as_vector(), compact(), and flatten().
Create a function that takes two arguments: list and what. This function will run map( list, what ), and pass the result to flatten_to_vector().
Create six_most, a function that combines tail(), sort(), and table().
Run extractor() on rstudioconf, and pass the result to six_most().
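To make the extraction step more concrete, the sketch below runs the same kind of pipeline on a tiny made-up list. The object toy_tweets and its contents are hypothetical stand-ins, and its shape (a NULL or a list of handles per tweet) is an assumption about how rstudioconf is organized.

```r
library(purrr)

# Hypothetical stand-in for rstudioconf: each tweet is a list whose
# mentions_screen_name element is NULL or a list of handles (assumed shape)
toy_tweets <- list(
  list(text = "first",  mentions_screen_name = NULL),
  list(text = "second", mentions_screen_name = list("a_user", "b_user")),
  list(text = "third",  mentions_screen_name = list("a_user"))
)

# Pull the named element out of every tweet; tweets without mentions give NULL
mentions <- map(toy_tweets, "mentions_screen_name")

# Remove one level of nesting, drop empty entries, simplify to an atomic vector
toy_handles <- mentions %>%
  flatten() %>%
  compact() %>%
  as_vector()

toy_handles
```

Written as a pipe, the steps run left to right. The same three functions passed to compose() would be listed in the opposite order, since compose() applies the last one first by default.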
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Combine as_vector(), compact(), and flatten()
flatten_to_vector <- ___(___, ___, ___)
# Complete the function
extractor <- function(list, what = "mentions_screen_name"){
  map( ___ , ___ ) %>%
    ___()
}
# Create six_most, with tail(), sort(), and table()
six_most <- ___(___, ___, ___)
# Run extractor() on rstudioconf
___(rstudioconf) %>%
  ___()