Two sample mean test statistic
The hypothesis test for determining whether the means of two populations differ uses a different type of test statistic from the z-scores you saw in Chapter 1. It is called "t", and it can be calculated from three values from each sample (the mean, the standard deviation, and the sample size) using this equation.
$$ t = \dfrac{(\bar{x}_{\text{child}} - \bar{x}_{\text{adult}})}{\sqrt{\dfrac{s_{\text{child}}^2}{n_{\text{child}}} + \dfrac{s_{\text{adult}}^2}{n_{\text{adult}}}}} $$
While trying to determine why some shipments are late, you may wonder if the weight of the shipments that were late is different from the weight of the shipments that were on time. The late_shipments dataset has been split into a "yes" group, where late == "Yes", and a "no" group, where late == "No". The weight of the shipment is given in the weight_kilograms variable.
For convenience, the sample means for the two groups are available as xbar_no and xbar_yes. The sample standard deviations are s_no and s_yes. The sample sizes are n_no and n_yes.
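The equation above is written with "child" and "adult" subscripts. For the shipment groups it might be restated as follows, assuming the "no" group is placed first; that ordering is an assumption made here for illustration, and swapping the groups only flips the sign of t.

$$ t = \dfrac{(\bar{x}_{\text{no}} - \bar{x}_{\text{yes}})}{\sqrt{\dfrac{s_{\text{no}}^2}{n_{\text{no}}} + \dfrac{s_{\text{yes}}^2}{n_{\text{yes}}}}} $$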
This exercise is part of the course
Hypothesis Testing in R
Exercise instructions
- Calculate the numerator of the test statistic.
- Calculate the denominator of the test statistic.
- Use those two numbers to calculate the test statistic.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Calculate the numerator of the test statistic
numerator <- ___
# Calculate the denominator of the test statistic
denominator <- ___
# Calculate the test statistic
t_stat <- ___
# See the result
t_stat
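To make the three steps concrete, here is a minimal sketch of the same arithmetic on made-up summary statistics. The numbers below are hypothetical and are not taken from late_shipments; the subtraction order ("no" minus "yes") is also an assumption, and it affects only the sign of the result.

```r
# Hypothetical summary statistics, chosen so the arithmetic is easy to follow.
# These are NOT the values from the late_shipments dataset.
xbar_no  <- 100   # sample mean weight, "no" group
xbar_yes <- 110   # sample mean weight, "yes" group
s_no     <- 30    # sample standard deviation, "no" group
s_yes    <- 40    # sample standard deviation, "yes" group
n_no     <- 100   # sample size, "no" group
n_yes    <- 100   # sample size, "yes" group

# Numerator: difference between the two sample means
# (assumed order: "no" minus "yes")
numerator <- xbar_no - xbar_yes

# Denominator: square root of the sum of each variance divided by its sample size
denominator <- sqrt(s_no^2 / n_no + s_yes^2 / n_yes)

# Test statistic: numerator divided by denominator
t_stat <- numerator / denominator

# See the result
t_stat
```

With these made-up inputs the numerator is -10 and the denominator is sqrt(9 + 16) = 5, so the statistic works out to -2. In the exercise itself the six variables are already provided, so only the three calculation lines are needed.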