Comparing two proportions
When we deal with categorical data, our parameter of interest is a proportion, which we estimate with a sample proportion. An example is the proportion of people who are left-wing voters. Imagine that we want to compare the proportion of males who are left-wing voters with the proportion of females who are left-wing voters. In this case our two groups are independent, and we usually use a Z test for the comparison.
Before we run a Z test to see whether the difference between two proportions is statistically significant, we usually calculate two things:
- The difference between the two sample proportions
- The standard error
The difference between the two sample proportions is easily calculated as \(p_1 - p_2\). The formula for the standard error is a little trickier, as it involves the pooled estimate \(\hat{p}\), which is the overall proportion of successes across both groups: \(\hat{p} = (x_1 + x_2) / (n_1 + n_2)\). Here, \(x_1\) is the number of successes in group 1, \(x_2\) the number of successes in group 2, \(n_1\) the sample size of group 1 and \(n_2\) the sample size of group 2. When only the sample proportions are known, the numbers of successes are \(x_1 = n_1 p_1\) and \(x_2 = n_2 p_2\).
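As a rough illustration of the quantities described above, here is a minimal Python sketch. The function name `pooled_summary`, the argument names and the counts are all hypothetical and not part of the exercise. The standard error line assumes the usual pooled form \(\sqrt{\hat{p}(1 - \hat{p})(1/n_1 + 1/n_2)}\), which the text above does not spell out.

```python
import math


def pooled_summary(x1, n1, x2, n2):
    """Return (difference, pooled estimate, pooled standard error).

    x1, x2 are success counts and n1, n2 are sample sizes
    (hypothetical argument names chosen for this sketch).
    """
    p1 = x1 / n1  # sample proportion in group 1
    p2 = x2 / n2  # sample proportion in group 2
    difference = p1 - p2
    # Pooled estimate: all successes divided by all observations
    pooled = (x1 + x2) / (n1 + n2)
    # Assumed pooled standard error of the difference in proportions
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return difference, pooled, se


# Hypothetical data: 45 successes out of 90, and 30 successes out of 110
difference, pooled, se = pooled_summary(45, 90, 30, 110)
print(difference, pooled, se)
```

With these made-up counts the pooled estimate is 75 / 200 = 0.375, which lies between the two sample proportions, as a pooled estimate always does.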
This exercise is part of the course
Inferential Statistics
Exercise instructions
- In this exercise we have a sample of 100 males with a proportion of left-wing voters of 0.6 and a sample of 150 females with a proportion of left-wing voters of 0.42. Calculate the difference between the male and the female sample proportion and store it in the variable difference.
- Calculate the pooled estimate \(\hat{p}\) and store it in a variable called pooled.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# calculate the difference in sample proportions and store it in a variable called difference
# calculate the pooled estimate and store it in a variable called pooled
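
One possible way to fill in the two steps is sketched below in Python. This is an assumption, since the exercise does not state which language the course uses. The variable names other than difference and pooled are hypothetical. Because the exercise gives proportions rather than counts, the sketch first recovers the number of left-wing voters in each group as sample size times proportion.

```python
# Values taken from the exercise instructions
n_males, p_males = 100, 0.6
n_females, p_females = 150, 0.42

# Difference between the male and the female sample proportion
difference = p_males - p_females

# Recover the success counts (hypothetical helper variables)
x_males = n_males * p_males        # about 60 left-wing voters
x_females = n_females * p_females  # about 63 left-wing voters

# Pooled estimate: all successes divided by the combined sample size
pooled = (x_males + x_females) / (n_males + n_females)

# Round only for display, to hide floating-point noise
print(round(difference, 3), round(pooled, 3))
```

The pooled estimate is pulled towards the female proportion because the female sample is the larger of the two.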