Exercise

When the null is true: decision

In the last exercise, the observed difference in proportions was comfortably in the middle of the null distribution. In this exercise, you'll come to a formal decision on whether to reject the null hypothesis, but instead of using p-values, you'll use the notion of a rejection region.

The rejection region is the range of values of the statistic that would lead you to reject the null hypothesis. In a two-tailed test, there are two rejection regions. You know that the upper region should contain the largest 2.5% of the null statistics (when alpha = .05), so you can extract the cutoff value by finding the .975 quantile(). Similarly, the lower region contains the smallest 2.5% of the null statistics, which can also be found using quantile().

Here's a quick look at how the quantile() function works for this simple dataset x.

x <- c(0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20)
quantile(x, probs = .5)
quantile(x, probs = .8)
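
To make the link to rejection regions concrete, here is a small sketch that asks quantile() for several probabilities at once on the same x, including the .025 and .975 cutoffs used when alpha = .05. It assumes R's default quantile method (type 7, linear interpolation); the name q is only for illustration.

```r
# Same toy vector as above: 11 evenly spaced values from 0 to 20
x <- c(0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20)

# quantile() accepts a vector of probabilities and returns a named vector
q <- quantile(x, probs = c(0.025, 0.5, 0.8, 0.975))
print(q)
#  2.5%   50%   80% 97.5%
#   0.5  10.0  16.0  19.5
```

The .5 quantile is the median (10), and the .025 and .975 quantiles (0.5 and 19.5) bracket the middle 95% of the values.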

Once you have the rejection region defined by the upper and lower cutoffs, you can make your decision regarding the null by checking if your observed statistic falls between those cutoffs (in which case you will fail to reject) or outside of them (in which case you will reject).
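
The decision rule can be sketched in base R as follows. The null statistics here are simulated stand-ins, not the course's permutation results, and null_stats, obs_stat, and decision are hypothetical names chosen for this illustration.

```r
set.seed(42)

# Stand-in for a null distribution of differences in proportions (assumed data)
null_stats <- rnorm(1000, mean = 0, sd = 0.05)

# Hypothetical observed statistic
obs_stat <- 0.01

alpha <- 0.05

# Cutoffs that bound the two rejection regions of a two-tailed test
lower <- quantile(null_stats, probs = alpha / 2)
upper <- quantile(null_stats, probs = 1 - alpha / 2)

# Fail to reject if the observed statistic lies between the cutoffs
decision <- if (obs_stat >= lower && obs_stat <= upper) {
  "fail to reject"
} else {
  "reject"
}
print(decision)

# Share of null statistics falling in the rejection regions (should be near alpha)
tail_share <- mean(null_stats < lower | null_stats > upper)
print(tail_share)
```

By construction, roughly alpha of the null statistics land in the two rejection regions combined, which is what makes the rule a test at the alpha significance level.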

Instructions 1/2

  • Create an object called alpha that takes the value 0.05.
  • Find the lower cutoff by starting with the null data frame, which has been carried over from the last exercise, and summarizing the stat column by finding the alpha / 2 quantile(). Save this value as lower. Next, find the 1 - alpha / 2 quantile() and save it to upper.
  • Check if your observed value of d_hat is between() the lower and upper cutoffs to find whether you should fail to reject the null hypothesis.
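
One way these steps might fit together with dplyr is sketched below. The null data frame and d_hat normally come from the previous exercise; the versions here are simulated stand-ins so the snippet runs on its own, and cutoffs and inside are hypothetical names. This is an illustration under those assumptions, not necessarily the exercise's reference solution.

```r
library(dplyr)

set.seed(1)

# Stand-ins for objects carried over from the last exercise (assumed data)
null <- data.frame(stat = rnorm(500, mean = 0, sd = 0.05))
d_hat <- 0.02

alpha <- 0.05

# Summarize the stat column into the two cutoffs
cutoffs <- null %>%
  summarize(
    lower = quantile(stat, probs = alpha / 2),
    upper = quantile(stat, probs = 1 - alpha / 2)
  )

# TRUE means d_hat lies between the cutoffs, so you fail to reject the null
inside <- between(d_hat, cutoffs$lower, cutoffs$upper)
print(inside)
```

Note that between() is inclusive on both ends, so a statistic exactly equal to a cutoff counts as inside.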