
Goodness of fit test

The null hypothesis in a goodness of fit test is a list of specific parameter values for each proportion. In your analysis, the equivalent hypothesis is that Benford's Law applies to the distribution of first digits of total vote counts at the city level. You could write this as:

$$ H_0: p_1 = .30, p_2 = .18, \ldots, p_9 = .05 $$

Here \(p_1\) is the height of the first bar in the Benford's Law bar plot, that is, the proportion of counts whose first digit is 1. The alternative hypothesis is that at least one of these proportions is different, meaning that the first-digit distribution doesn't follow Benford's Law.
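The proportions in the null hypothesis come from Benford's Law, which gives the probability of first digit \(d\) as \(\log_{10}(1 + 1/d)\). As a minimal sketch, the vector can be rebuilt in base R; the name `p_benford_sketch` is a hypothetical stand-in, not the `p_benford` object supplied by the exercise.

```r
# First digits 1 through 9
digits <- 1:9

# Benford's Law: P(first digit = d) = log10(1 + 1/d)
# `p_benford_sketch` is a hypothetical name used for illustration only
p_benford_sketch <- log10(1 + 1 / digits)
names(p_benford_sketch) <- digits

# Rounded to two decimals for comparison with the values in H_0
round(p_benford_sketch, 2)

# The nine probabilities sum to 1 (up to floating-point error)
sum(p_benford_sketch)
```

Rounded to two decimals, the entries for digits 1, 2, and 9 are .30, .18, and .05, matching the values written in \(H_0\).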

In this exercise, you'll use simulation to build up the null distribution of the sorts of chi-squared statistics that you'd observe if in fact these counts did follow Benford's Law.
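To make the idea of a simulated null distribution concrete, here is a rough base R sketch that does not use the course's tooling. It assumes a hypothetical sample size of 300 cities, and the names `n_cities`, `chisq_gof`, and `null_stats_sketch` are made up for illustration. Each simulated data set is one multinomial draw of first-digit counts under Benford's Law.

```r
# Benford's Law probabilities for first digits 1 through 9
p_sketch <- log10(1 + 1 / (1:9))

# Hypothetical sample size: an assumption for illustration only
n_cities <- 300

# Chi-squared goodness of fit statistic:
# sum over categories of (observed - expected)^2 / expected
chisq_gof <- function(observed, p) {
  expected <- sum(observed) * p
  sum((observed - expected)^2 / expected)
}

set.seed(1)  # make the random draws reproducible

# Each column is one simulated vector of first-digit counts under H_0
draws <- rmultinom(n = 500, size = n_cities, prob = p_sketch)

# One chi-squared statistic per simulated data set
null_stats_sketch <- apply(draws, 2, chisq_gof, p = p_sketch)

# Under H_0 these statistics average close to the degrees of freedom, 9 - 1 = 8
mean(null_stats_sketch)
```

An observed statistic far out in the right tail of `null_stats_sketch` would be evidence against the null hypothesis.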

This exercise is part of the course

Inference for Categorical Data in R


Exercise instructions

  • Inspect p_benford by printing it to the screen.
  • Starting with iran, compute the chi-squared statistic by using chisq_stat. Note that you must specify the variable in the data frame that will serve as your response, as well as the vector of probabilities that you wish to compare it to.
  • Construct a null distribution of 500 simulated Chisq statistics under the point null hypothesis that the vector of proportions p is p_benford. Save the resulting statistics as null.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Inspect p_benford
p_benford

# Compute observed stat
chi_obs_stat <- ___
  chisq_stat(response = ___, p = ___)

# Form null distribution
null <- ___
  # Specify the response
  ___
  # Set up the null hypothesis
  ___
  # Generate 500 reps
  ___
  # Calculate statistics
  ___
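For orientation, the same sequence of steps can be sketched on made-up data with the infer package. This is only one possible shape for the pipeline, not the exercise's official solution: the data frame `toy`, its column `first_digit`, and the names `p_sketch`, `chi_obs_sketch`, and `null_sketch` are all hypothetical, and the real `iran` data may be structured differently. Recent infer releases call the generation type "draw" instead of "simulate" and recommend `observe()` over `chisq_stat()`, so messages or warnings are possible.

```r
library(infer)  # also provides the %>% pipe

set.seed(42)

# Benford's Law probabilities, named to match the factor levels "1" to "9"
p_sketch <- log10(1 + 1 / (1:9))
names(p_sketch) <- 1:9

# Hypothetical toy data: 300 first digits drawn from Benford's Law
toy <- data.frame(
  first_digit = factor(
    sample(1:9, size = 300, replace = TRUE, prob = p_sketch),
    levels = 1:9
  )
)

# Observed chi-squared statistic for the toy data
chi_obs_sketch <- toy %>%
  chisq_stat(response = first_digit, p = p_sketch)

# Null distribution of 500 simulated chi-squared statistics
null_sketch <- toy %>%
  # Specify the response
  specify(response = first_digit) %>%
  # Set up the point null hypothesis
  hypothesize(null = "point", p = p_sketch) %>%
  # Generate 500 reps ("draw" in newer infer releases)
  generate(reps = 500, type = "simulate") %>%
  # Calculate statistics
  calculate(stat = "Chisq")
```

The result has one row per replicate, with the simulated statistic in the `stat` column.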