Histograms

The data set loan_data is loaded in your workspace. You previously explored categorical variables using the CrossTable() function. Now you would like to explore continuous variables to identify potential outliers or unexpected data structures.

To do this, let's experiment with the function hist() to understand the distribution of the loan amount (loan_amnt) across customers.

This exercise is part of the course

Credit Risk Modeling in R

Instructions

Use hist() to create a histogram with only one argument: loan_data$loan_amnt. Assign the result to a new object called hist_1.
Use $breaks along with the object hist_1 to get more information on the histogram breaks. Knowing the location of the breaks is important because if they are poorly chosen, the histogram may be misleading.
Change the number of breaks in hist_1 to 200 by specifying the breaks argument. Additionally, name the x-axis "Loan amount" using the xlab argument and title it "Histogram of the loan amount" using the main argument. Save the result to hist_2. Why do the peaks occur where they occur?

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Create histogram of loan_amnt: hist_1


# Print locations of the breaks in hist_1


# Change number of breaks and add labels: hist_2
hist_2 <- hist(loan_data$loan_amnt, breaks = ___, xlab = "___", 
               main = "___")
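
As a rough illustration of how hist() objects and the breaks argument behave, here is a minimal sketch on synthetic data. The vector loan_amnt_demo is a made-up stand-in, not the course's loan_data, and the names h_default and h_fine are hypothetical. The sketch relies on two documented behaviours of hist(): it returns an object whose $breaks and $counts can be inspected, and a single number passed to breaks is only a suggestion that R rounds to "pretty" boundaries.

```r
# Synthetic stand-in for loan_data$loan_amnt (assumption: NOT the course data).
# Amounts are drawn in steps of 500 to mimic values clustering at round numbers.
set.seed(1)
loan_amnt_demo <- sample(seq(500, 35000, by = 500), size = 1000, replace = TRUE)

# plot = FALSE returns the histogram object without drawing anything
h_default <- hist(loan_amnt_demo, plot = FALSE)
h_fine    <- hist(loan_amnt_demo, breaks = 200, plot = FALSE)

# $breaks holds the bin boundaries, $counts the observations per bin
h_default$breaks
length(h_default$breaks)

# breaks = 200 is a suggestion: the actual number of bins may differ from 200
length(h_fine$breaks)
```

With many narrow bins, values that cluster at round amounts show up as isolated spikes, which is one thing to look for when comparing the two histograms in the exercise.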
