Exercise

Balanced bucketing

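The Quantity variable in the online_retail dataset has a heavily right-skewed distribution: most individuals buy 1 to 5 items, but a small number buy close to 50. With buckets of equal width, as in the table below, the first bucket holds most of the observations and the last holds almost none. How can we capture this kind of distribution better using buckets?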

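library(dplyr)  # provides %>% and select()
# Count how many observations fall into each existing quant_cat bucket
online_retail %>%
    select(quant_cat) %>%
    table()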

  (1,6]  (6,11] (11,16] (16,21] (21,26] (26,31] (31,36] (36,41] (41,46] 
  38915    6362   10646    1099    4744     295     896     208      24 
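
The bucket labels above are consistent with equal-width binning. The exercise does not show how quant_cat was built, so the following is only a sketch under the assumption that cut() was used with breaks every 5 units, inferred from the labels. The vector quantity is made-up illustration data, not the real online_retail values.

```r
# Hypothetical quantities; not the real online_retail data.
quantity <- c(1, 2, 3, 6, 7, 11, 20, 33, 46)

# Assumed breaks at 1, 6, 11, ..., 46: nine buckets, each 5 units wide.
breaks <- seq(1, 46, by = 5)
quant_cat <- cut(quantity, breaks = breaks)

# Show the count per bucket, including any values that fell outside the breaks.
table(quant_cat, useNA = "ifany")
```

With cut()'s defaults the intervals are open on the left, so a value equal to the lowest break (here 1) becomes NA unless include.lowest = TRUE is passed.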

Instructions

  • Break the Quantity variable into three buckets that are balanced in terms of how many individuals are represented in each bucket.
  • Use table() to see how many individuals fall into each bucket.
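
One possible way to approach these two steps is dplyr's ntile(), which ranks the rows and splits them into a given number of groups of nearly equal size. The sketch below is not necessarily the intended solution. It runs on a small made-up data frame, toy_retail, standing in for online_retail, and the column name quant_q is likewise an assumption chosen for illustration.

```r
library(dplyr)

# Hypothetical stand-in for online_retail: 75 small orders and 15 larger ones.
toy_retail <- data.frame(
  Quantity = c(rep(1:5, each = 15), seq(6, 48, by = 3))
)

# ntile() ranks the rows by Quantity and assigns each to one of 3 buckets
# so that the buckets contain (nearly) the same number of rows.
toy_retail <- toy_retail %>%
  mutate(quant_q = ntile(Quantity, 3))

# Count how many rows land in each bucket.
bucket_counts <- toy_retail %>%
  select(quant_q) %>%
  table()

print(bucket_counts)
```

Because ntile() balances on row counts rather than on values, rows with the same Quantity can end up in different buckets when ties straddle a boundary. If that matters, cutting at quantile() break points is an alternative, though heavily tied data can produce duplicate breaks.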