
Exercise

A p-value, two ways

As you've seen before, there are usually two ways to obtain a null distribution: by computation or by mathematical approximation. The chi-squared goodness of fit test is no exception. Here the approximating distribution is again the chi-squared distribution, with degrees of freedom equal to the number of categories minus one.

In this exercise you'll compare the two approaches by calculating a p-value that measures how consistent the distribution of first digits in the Iran data is with Benford's Law. The observed statistic that you created in the last exercise is saved in your workspace as chi_obs_stat.
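To make the comparison concrete, here is a minimal base R sketch of both routes to a p-value. It does not use the course's iran data: first_digit below is a hypothetical stand-in drawn from Benford's Law itself, and the names benford_p, chi_obs, chi_null, p_sim and p_approx are illustrative, not objects from the exercise.

```r
# Hypothetical stand-in data; the real iran data set is not reproduced here.
set.seed(1)
benford_p <- log10(1 + 1 / 1:9)   # Benford's Law probabilities for digits 1 to 9
n <- 500
first_digit <- factor(sample(1:9, n, replace = TRUE, prob = benford_p),
                      levels = 1:9)

# Observed chi-squared statistic: sum of (observed - expected)^2 / expected
observed <- as.numeric(table(first_digit))
expected <- n * benford_p
chi_obs <- sum((observed - expected)^2 / expected)

# Way 1 (computation): simulate counts under the null hypothesis and
# recompute the statistic for every simulated data set (one per column).
sim_counts <- rmultinom(5000, size = n, prob = benford_p)
chi_null <- colSums((sim_counts - expected)^2 / expected)
p_sim <- mean(chi_null >= chi_obs)

# Way 2 (approximation): upper tail of the chi-squared distribution
# with (number of categories - 1) degrees of freedom.
df <- nlevels(first_digit) - 1
p_approx <- pchisq(chi_obs, df = df, lower.tail = FALSE)

c(simulated = p_sim, approximate = p_approx)
```

With a sample this large the two p-values should land close to each other. The simulated one carries a little extra noise that shrinks as the number of simulated data sets grows.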

Instructions 1/2

  • Compute the degrees of freedom of the chi-squared approximation: take the first_digit vector from the iran data, count its categories with the nlevels() function, then subtract one.
  • Using null, draw a density plot of the chi-squared statistics. Add a vertical line at the observed statistic, then overlay the curve of the chi-squared approximation, with those degrees of freedom, in blue.
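The two steps above might be sketched as follows. This is one possible approach, not the course's reference solution: it uses base R graphics so that it runs without extra packages, and it builds hypothetical stand-ins for iran, null and chi_obs_stat because the real workspace objects are not available here. The column name stat in null is an assumption.

```r
# Hypothetical stand-ins for the workspace objects used in the exercise.
set.seed(1)
benford_p <- log10(1 + 1 / 1:9)
n <- 500
iran <- data.frame(
  first_digit = factor(sample(1:9, n, replace = TRUE, prob = benford_p),
                       levels = 1:9)
)
expected <- n * benford_p
null <- data.frame(  # assumed layout: one simulated statistic per row
  stat = colSums((rmultinom(2000, n, benford_p) - expected)^2 / expected)
)
chi_obs_stat <- sum((as.numeric(table(iran$first_digit)) - expected)^2 / expected)

# Step 1: degrees of freedom = number of categories minus one
degrees_of_freedom <- nlevels(iran$first_digit) - 1

# Step 2: density of the simulated statistics, a vertical line at the
# observed statistic, and the chi-squared approximation in blue
plot(density(null$stat),
     main = "Null distribution of the chi-squared statistic",
     xlab = "chi-squared statistic")
abline(v = chi_obs_stat, lty = 2)
curve(dchisq(x, df = degrees_of_freedom), add = TRUE, col = "blue")
```

If the blue curve tracks the simulated density closely, the mathematical approximation is a reasonable substitute for the computed null distribution.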