Exercise

Splitting the dataset

In a previous exercise, you determined that the mean number of retweets per tweet is 3.3. In this exercise, we'll look at how many tweets are above this mean, and how many are below.

To do so, we will first create a mapper that tests whether .x is above 3.3. We'll then use partial() to prefill map_at(), with .at set to "retweet_count" and .f set first to the mapper we've created, and then to the negation of this mapper.

Note that purrr's behavior has changed since this course was created. To avoid an argument clash between the .f of partial() and the .f of map_at(), you must use the quasi-quotation equals operator, := (sometimes known as the "walrus operator"). For the purpose of this exercise, all you need to know is that := works like =, but lets partial() know that the argument should be passed on to map_at() rather than kept for itself.
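
As a rough illustration of the := note above, here is a minimal sketch on a single made-up tweet. The names tweet, is_positive and check_retweets are hypothetical and do not come from the exercise; only partial(), map_at() and as_mapper() are purrr functions.

```r
library(purrr)

# Hypothetical toy tweet, not part of the rstudioconf dataset
tweet <- list(text = "hello", retweet_count = 5)

# Hypothetical mapper: is the value strictly positive?
is_positive <- as_mapper(~ .x > 0)

# `.f := is_positive` is forwarded to map_at() as its .f argument.
# Writing `.f = is_positive` would instead be matched to partial()'s own .f.
check_retweets <- partial(map_at, .at = "retweet_count", .f := is_positive)

# Only the "retweet_count" element is transformed; "text" is left untouched
res <- check_retweets(tweet)
res$retweet_count
res$text
```

Here res$retweet_count should be a logical value and res$text should still be the original string, since map_at() only touches the elements named in .at.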

Once these tools are created, we will use them on the non_rt object, which is an extract of the "original tweets" from the rstudioconf dataset.

purrr has been loaded for you.

Instructions
  • Create mean_above, a mapper that tests if .x is above 3.3.

  • Prefill two versions of map_at(): one with "retweet_count" and mean_above, and the other with "retweet_count" and the negation of mean_above.

  • Map these two prefilled functions on non_rt, and keep only the "retweet_count" elements.

  • Get the size of the two results.
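
One possible way to follow the four steps above is sketched below. Because non_rt is not available outside the exercise session, the sketch runs on toy_tweets, a made-up stand-in with invented retweet counts. The names above, below, ab and bl are also assumptions chosen for illustration, not names required by the exercise.

```r
library(purrr)

# Hypothetical stand-in for non_rt: a list of tweets, each a named list
toy_tweets <- list(
  list(text = "a", retweet_count = 0),
  list(text = "b", retweet_count = 10),
  list(text = "c", retweet_count = 2),
  list(text = "d", retweet_count = 7),
  list(text = "e", retweet_count = 1)
)

# Step 1: a mapper that tests whether .x is above the mean of 3.3
mean_above <- as_mapper(~ .x > 3.3)

# Step 2: two prefilled versions of map_at(), using := for .f
above <- partial(map_at, .at = "retweet_count", .f := mean_above)
below <- partial(map_at, .at = "retweet_count", .f := negate(mean_above))

# Step 3: apply each version to every tweet, then keep the tweets whose
# "retweet_count" element is now TRUE
ab <- keep(map(toy_tweets, above), "retweet_count")
bl <- keep(map(toy_tweets, below), "retweet_count")

# Step 4: size of each result
length(ab)
length(bl)
```

With this toy data, two tweets (10 and 7 retweets) are above 3.3 and three are below, so the two lengths should be 2 and 3. In the exercise itself, toy_tweets would be replaced by non_rt.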