Histograms

The data set loan_data is loaded in your workspace. You previously explored categorical variables using the CrossTable() function. Now you would like to explore continuous variables to identify potential outliers or unexpected data structures.

To do this, let's experiment with the function hist() to understand the distribution of loan amounts across customers.

This exercise is part of the course

Credit Risk Modeling in R

Exercise instructions

  • Use hist() to create a histogram with only one argument: loan_data$loan_amnt. Assign the result to a new object called hist_1.
  • Use $breaks along with the object hist_1 to get more information on the histogram breaks. Knowing the location of the breaks is important because if they are poorly chosen, the histogram may be misleading.
  • Re-create the histogram with 200 breaks by specifying the breaks argument. Additionally, label the x-axis "Loan amount" using the xlab argument and set the title to "Histogram of the loan amount" using the main argument. Save the result to hist_2. Why do the peaks occur where they occur?
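
The steps above can be tried on simulated data first. The following is a minimal sketch, not the course solution: sim_loans is a hypothetical stand-in for loan_data, with amounts rounded to the nearest 1000 on the assumption that customers tend to request round figures. It shows that hist() returns an object whose breaks can be inspected, and that a single number passed to breaks is treated only as a suggestion, so the number of bins actually used may differ from the number requested.

```r
# Hypothetical stand-in for loan_data (an assumption for illustration only):
# 5000 simulated loan amounts, rounded to the nearest 1000.
set.seed(42)
sim_loans <- data.frame(
  loan_amnt = round(runif(5000, min = 1000, max = 35000) / 1000) * 1000
)

# Default call: hist() picks the breaks itself and returns a "histogram" object
h_default <- hist(sim_loans$loan_amnt)
h_default$breaks

# Finer bins: a single number for breaks is a suggestion, not a guarantee
h_fine <- hist(sim_loans$loan_amnt, breaks = 200,
               xlab = "Loan amount",
               main = "Histogram of the simulated loan amount")

# Number of bins actually used
length(h_fine$breaks) - 1
```

With many narrow bins, the rounding in the simulated data should show up as isolated spikes separated by empty bins, which is the kind of pattern the question about peaks asks you to look for.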

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Create histogram of loan_amnt: hist_1


# Print locations of the breaks in hist_1


# Change number of breaks and add labels: hist_2
hist_2 <- hist(loan_data$loan_amnt, breaks = ___, xlab = "___", 
               main = "___")