Bucketing a numeric variable into a factor
Your old friend Dan sent you a list of 50 AAA rated bonds called AAA_rank, with each bond having an additional number from 1-100 describing how profitable he thinks that bond will be (100 being the most profitable). You are interested in doing further analysis on his suggestions, but first it would be nice if the bonds were bucketed by their ranking somehow. This would help you create groups of bonds, from least profitable to most profitable, to more easily analyze them.
This is a great example of creating a factor from a numeric vector. The easiest way to do this is to use cut(). Below, Dan's 1-100 ranking is bucketed into 5 evenly spaced groups. Note that the ( in the factor levels means we do not include the number beside it in that group, and the ] means that we do include that number in the group.
head(AAA_rank)
[1] 31 48 100 53 85 73
AAA_factor <- cut(x = AAA_rank, breaks = c(0, 20, 40, 60, 80, 100))
head(AAA_factor)
[1] (20,40] (40,60] (80,100] (40,60] (80,100] (60,80]
Levels: (0,20] (20,40] (40,60] (60,80] (80,100]
In the cut() function, the breaks = argument lets you specify the boundaries of the groups that R buckets your data into!
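To make the ( and ] notation concrete, here is a minimal sketch using a made-up vector called scores (hypothetical, not Dan's data). It shows what happens to a value that sits exactly on a boundary, and uses the include.lowest argument of cut() to close the first interval on the left. Dan's ranking starts at 1, so the 0 here is included only to illustrate the edge case.

```r
# Hypothetical scores chosen to sit on or near the break points
scores <- c(0, 20, 21, 40, 100)

# Default behavior: each interval is open on the left, closed on the right
buckets <- cut(x = scores, breaks = c(0, 20, 40, 60, 80, 100))
print(buckets)
# 0 is not inside (0,20], so it becomes NA; 20 lands in (0,20]

# include.lowest = TRUE turns the first interval into [0,20]
buckets_closed <- cut(x = scores,
                      breaks = c(0, 20, 40, 60, 80, 100),
                      include.lowest = TRUE)
print(buckets_closed)
```

If your data can contain the smallest break value itself, include.lowest = TRUE is one way to avoid an unexpected NA.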
This exercise is part of the course Introduction to R for Finance
Exercise instructions
- Instead of 5 buckets, can you create just 4? In breaks =, use a vector from 0 to 100 where each element is 25 numbers apart. Assign it to AAA_factor.
- The 4 buckets do not have very descriptive names. Use levels() to rename the levels to "low", "medium", "high", and "very_high", in that order.
- Print the newly named AAA_factor.
- Plot the AAA_factor to visualize your work!
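As a warm-up for the steps above, here is a small sketch of the same pattern on different data, so it does not give the exercise away. The vector ratings, the factor rating_factor, and the level names are all hypothetical. It assumes only base R: seq() to build evenly spaced breaks, levels() to rename, and plot() to draw a bar chart of a factor.

```r
# Hypothetical ratings on a 1-10 scale (illustrative, not the exercise data)
ratings <- c(2, 9, 5, 7, 1, 10)

# seq() builds evenly spaced break points: 0, 5, 10
rating_factor <- cut(x = ratings, breaks = seq(from = 0, to = 10, by = 5))

# The default level names are the interval labels
levels(rating_factor)

# Assigning a character vector to levels() renames them in order
levels(rating_factor) <- c("bottom_half", "top_half")
print(rating_factor)

# table() counts the values per bucket; plot() on a factor
# draws those same counts as a bar chart
table(rating_factor)
plot(rating_factor)
```

The new names must be supplied in the same order as the existing levels, which is why the exercise asks for "low" through "very_high" in that order.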
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Create 4 buckets for AAA_rank using cut()
AAA_factor <- cut(x = ___, breaks = ___)
# Rename the levels
# Print AAA_factor
# Plot AAA_factor
plot(___)