
Bucketing a numeric variable into a factor

Your old friend Dan sent you a list of 50 AAA-rated bonds called AAA_rank. Each bond has a number from 1 to 100 describing how profitable he thinks it will be, with 100 being the most profitable. You are interested in doing further analysis on his suggestions, but first it would help to bucket the bonds by their ranking. That way you can create groups of bonds, from least profitable to most profitable, and analyze them more easily.

This is a great example of creating a factor from a numeric vector. The easiest way to do this is to use cut(). Below, Dan's 1-100 ranking is bucketed into 5 equal-width groups. In the factor levels, a ( means the number beside it is not included in that group, and a ] means it is included. For example, (20,40] contains every ranking greater than 20 and up to and including 40.

head(AAA_rank)

[1]  31  48 100  53  85  73

AAA_factor <- cut(x = AAA_rank, breaks = c(0, 20, 40, 60, 80, 100))

head(AAA_factor)

[1] (20,40]  (40,60]  (80,100] (40,60]  (80,100] (60,80] 
Levels: (0,20] (20,40] (40,60] (60,80] (80,100]

In the cut() function, the breaks argument specifies the cut points that mark the edges of the buckets. Six break points, as above, produce five buckets.
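To make the interval notation concrete, the sketch below reruns cut() on the six values printed by head(AAA_rank) above. The vector name ranks is an assumption used only for illustration, since the full AAA_rank data is not shown here. The last two calls illustrate how boundary values are treated under cut()'s default settings.

```r
# Hypothetical sample: the six values printed by head(AAA_rank) above
ranks <- c(31, 48, 100, 53, 85, 73)
bond_breaks <- c(0, 20, 40, 60, 80, 100)

# Default behaviour: each interval is open on the left, closed on the right
buckets <- cut(x = ranks, breaks = bond_breaks)
print(buckets)

# Count how many values landed in each bucket (empty buckets show as 0)
print(table(buckets))

# A boundary value belongs to the bucket it closes: 40 is in (20,40]
on_boundary <- cut(x = 40, breaks = bond_breaks)
print(on_boundary)

# A value equal to the lowest break falls outside every bucket and becomes NA
below_range <- cut(x = 0, breaks = bond_breaks)
print(below_range)
```

Because Dan's rankings start at 1, using 0 as the lowest break keeps every bond inside a bucket.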

This exercise is part of the course “Introduction to R for Finance”.

Exercise instructions

  • Instead of 5 buckets, can you create just 4? For breaks, use a vector from 0 to 100 whose elements are 25 apart. Assign the result to AAA_factor.
  • The 4 buckets do not have very descriptive names. Use levels() to rename the levels to "low", "medium", "high", and "very_high", in that order.
  • Print the newly named AAA_factor.
  • Plot the AAA_factor to visualize your work!

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Create 4 buckets for AAA_rank using cut()
AAA_factor <- cut(x = ___, breaks = ___)

# Rename the levels 


# Print AAA_factor


# Plot AAA_factor
plot(___)
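One possible way to complete the template is sketched below. Dan's actual AAA_rank data is not reproduced here, so the sketch generates a stand-in vector with sample(); the seed and the simulated values are assumptions, not the course data.

```r
# Stand-in for Dan's data (assumption): 50 rankings between 1 and 100
set.seed(42)
AAA_rank <- sample(1:100, size = 50, replace = TRUE)

# Create 4 buckets for AAA_rank; seq() builds the breaks 0, 25, 50, 75, 100
AAA_factor <- cut(x = AAA_rank, breaks = seq(from = 0, to = 100, by = 25))

# Rename the levels, keeping the low-to-high order of the buckets
levels(AAA_factor) <- c("low", "medium", "high", "very_high")

# Print AAA_factor
print(AAA_factor)

# Plot AAA_factor: plot() on a factor draws a bar chart of the level counts
plot(AAA_factor)
```

Writing the breaks by hand as c(0, 25, 50, 75, 100) gives the same result as the seq() call.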



Learn essential data structures such as lists and data frames and apply that knowledge directly to financial examples.

Questions with answers that fall into a limited number of categories can be classified as factors. In this chapter, you will use bond credit ratings to learn all about creating, ordering, and subsetting factors.
