1. Two-sample proportion tests
The previous example tested a single proportion against a specific value. As with means, we can also test differences between proportions in two populations.
2. Comparing two proportions
The Stack Overflow survey contains a hobbyist variable. The value "Yes" means the user described themself as a hobbyist; "No" means they described themself as a professional.
We can hypothesize that the proportion of users who are hobbyists is the same for the under thirty age category as the at least thirty category.
More formally, the null hypothesis is that the difference between the population parameters for each group is zero.
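Written out, the hypotheses look like this. The subscripts are notation chosen for this sketch rather than taken from the slide, and the two-sided alternative is the one used later with prop_test.

```latex
H_0 : p_{\geq 30} - p_{<30} = 0
\qquad
H_A : p_{\geq 30} - p_{<30} \neq 0
```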
I've set a significance level of point-zero-five.
3. Calculating the z-score
Let's break down this z-score equation.
The sample statistic is the difference in the proportions for each category. That's the two p-hat values in the numerator.
We subtract the hypothesized value of the population parameter. Assuming the null hypothesis is true, that's just zero, so the term disappears.
The denominator is the standard error of the sample statistic.
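The slide's equation isn't captured in this transcript. One way to write what was just described, assuming the "at least thirty" p-hat comes first as it does in the prop_test call later, is:

```latex
z = \frac{(\hat{p}_{\geq 30} - \hat{p}_{<30}) - 0}{SE(\hat{p}_{\geq 30} - \hat{p}_{<30})}
```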
Again we can avoid having to generate a bootstrap distribution. The equation for the standard error is a slightly more complicated version of the equation for the one-sample case.
In this equation, note that p-hat is the sample proportion for the whole dataset, not for each category. This whole dataset p-hat is known as a pooled estimate of the population proportion.
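A sketch of that standard error in the same notation, where p-hat without a subscript is the pooled estimate and each n is the number of rows in that age category:

```latex
SE(\hat{p}_{\geq 30} - \hat{p}_{<30}) =
\sqrt{\frac{\hat{p}\,(1 - \hat{p})}{n_{\geq 30}} + \frac{\hat{p}\,(1 - \hat{p})}{n_{<30}}}
```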
We need one more equation to get p-hat. It's a weighted mean of the p-hats for each category.
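As a weighted mean, with the group sizes as the weights, the pooled estimate can be written as:

```latex
\hat{p} = \frac{n_{\geq 30}\,\hat{p}_{\geq 30} + n_{<30}\,\hat{p}_{<30}}{n_{\geq 30} + n_{<30}}
```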
This looks horrendous, but it's just arithmetic, and R is pretty good at that.
The good news is that we only need four numbers from the sample dataset to do these calculations.
4. Getting the numbers for the z-score
To calculate these four numbers you group_by the categories, and summarize to calculate the sample proportion and row counts.
After that, you can do the arithmetic to get the test statistic - the z-score is minus four-point-two - and call pnorm to get the p-value. I'm not going to run through these steps because they're exactly the same as you've seen already.
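For reference, here is a minimal sketch of those steps. It runs on a small made-up data frame, not the Stack Overflow survey, so it will not reproduce the z-score quoted above. The column name age_cat and the level labels "At least 30" and "Under 30" are assumptions for illustration.

```r
library(dplyr)

# Made-up toy data (NOT the real survey): 200 rows aged at least 30,
# 300 rows aged under 30, each with a hobbyist answer.
toy_survey <- data.frame(
  age_cat = rep(c("At least 30", "Under 30"), times = c(200, 300)),
  hobbyist = c(
    rep(c("Yes", "No"), times = c(150, 50)),
    rep(c("Yes", "No"), times = c(255, 45))
  )
)

# The four numbers: a sample proportion and a row count per category.
group_stats <- toy_survey %>%
  group_by(age_cat) %>%
  summarize(
    p_hat = mean(hobbyist == "Yes"),
    n = n()
  )

# Name the values by category so the subtraction order is explicit.
p_hat <- setNames(group_stats$p_hat, group_stats$age_cat)
n <- setNames(group_stats$n, group_stats$age_cat)

# Pooled estimate: weighted mean of the per-category p-hats.
p_hat_pooled <- sum(n * p_hat) / sum(n)

# Standard error of the difference in proportions under the null.
std_error <- sqrt(
  p_hat_pooled * (1 - p_hat_pooled) / n[["At least 30"]] +
    p_hat_pooled * (1 - p_hat_pooled) / n[["Under 30"]]
)

# z-score: "at least thirty" p-hat minus "under thirty" p-hat, over the SE.
z_score <- (p_hat[["At least 30"]] - p_hat[["Under 30"]]) / std_error

# Two-sided p-value from the standard normal distribution.
p_value <- 2 * pnorm(-abs(z_score))
```

Naming the vectors by category avoids relying on the row order that summarize happens to return.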
Instead, I'll show you an easier way.
5. Proportion tests using prop_test()
Naturally, R has at least two functions for performing this type of hypothesis test, known as a proportion test. Base R has a function named prop-dot-test, though unfortunately, its interface is somewhat peculiar. To make your life easier, it's best to use prop-underscore-test from infer.
This takes a formula, with your response variable on the left and the categories to compare on the right. Here the response is the hobbyist variable, which is categorical rather than numeric, and the categories are the two age groups. The order argument specifies which p-hat should be subtracted from the other. Here, it is specified so that you start with the "at least thirty" p-hat, then subtract the "under thirty" p-hat.
The success argument specifies which of the two response levels to get the proportion of. The alternative argument takes the same values as it does for the t-test. Here, we have a two-sided alternative hypothesis.
The correct argument specifies whether or not to apply Yates' continuity correction. This is a fudge factor needed for technical reasons when the sample size is very small. Since each group has over one thousand observations, we don't need it.
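Putting those arguments together, a call might look like the sketch below. It uses a small made-up data frame rather than the survey itself, and the column name age_cat and the level labels are assumptions for illustration.

```r
library(infer)

# Made-up toy data (NOT the real survey), with assumed column names and levels.
toy_survey <- data.frame(
  age_cat = rep(c("At least 30", "Under 30"), times = c(200, 300)),
  hobbyist = c(
    rep(c("Yes", "No"), times = c(150, 50)),
    rep(c("Yes", "No"), times = c(255, 45))
  )
)

test_results <- prop_test(
  toy_survey,
  hobbyist ~ age_cat,                       # response ~ explanatory categories
  order = c("At least 30", "Under 30"),     # first p-hat minus second p-hat
  success = "Yes",                          # response level counted as a success
  alternative = "two-sided",
  correct = FALSE                           # no Yates' continuity correction
)

test_results
```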
The p-value is in the third column of the resulting tibble. It's smaller than the significance level we specified earlier, so we reject the null hypothesis and conclude that there is a difference in the proportion of Stack Overflow hobbyists between the under thirty and at least thirty groups.
6. Let's practice!
Time for some proportion tests.