Perform an independent t-test
In this exercise, you'll manually perform an independent t-test the same way you did for the dependent t-test in the previous chapter. Continuing with the working memory example, our null hypothesis is that the difference in intelligence score gain between the group that trained for 8 days and the group that trained for 19 days is equal to zero. If our observed t-value is sufficiently large, we can reject the null in favor of the alternative hypothesis, which would imply a significant difference in intelligence gain between the two training groups.
Calculation of the observed t-value for an independent t-test is similar to the dependent t-test, but involves slightly different formulas. The t-value is now
$$t = \frac{(\bar{x_1} - \bar{x_2})}{se_p}$$
where \(\bar{x_1}\) and \(\bar{x_2}\) are the mean intelligence gains for group 1 and group 2, respectively. \(se_p\) is the pooled standard error, which is equivalent to
$$se_p = \sqrt{ \frac{var_1}{n_1} + \frac{var_2}{n_2} }$$
where \(var_1\) and \(var_2\) are the variances and \(n_1\) and \(n_2\) are the sample sizes of groups 1 and 2, respectively.
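As a rough illustration of the two formulas above, the sketch below computes the t-value for two small made-up samples. The vectors gain_a and gain_b are hypothetical and are not the exercise data; they were chosen only so that the arithmetic is easy to follow by hand.

```r
# Hypothetical gain scores for two independent groups (made-up numbers,
# not the working memory data used in the exercise)
gain_a <- c(1, 2, 3, 4, 5)
gain_b <- c(3, 4, 5, 6, 7)

# Difference between the two group means
mean_diff_example <- mean(gain_b) - mean(gain_a)

# Standard error from the formula above: sqrt(var_1 / n_1 + var_2 / n_2)
se_example <- sqrt(var(gain_a) / length(gain_a) + var(gain_b) / length(gain_b))

# Observed t-value: mean difference divided by the standard error
t_example <- mean_diff_example / se_example
```

With these numbers the means are 3 and 5, each variance is 2.5, and each group has 5 observations, so the standard error is \(\sqrt{0.5 + 0.5} = 1\) and the t-value is 2.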
This exercise is part of the course
Intro to Statistics with R: Student's T-test
Exercise instructions
The subsets of data for both 8 and 19 days of training, wm_t08 and wm_t19, are available in your workspace. Recall that the gain column contains the gain in intelligence score before and after training for each subject.
- Compute the mean intelligence score gain for each of the two training groups and store the results in mean_t08 and mean_t19. Use the mean() function to do this.
- Use the objects mean_t08 and mean_t19 to find the difference in means. Subtract the lowest mean from the highest mean and store the result in mean_diff.
- Determine the number of participants in each sample using the code provided for you. Nothing to change here.
- Calculate the degrees of freedom for the relevant t-distribution. The formula for degrees of freedom in an independent t-test is \(n_1 + n_2 - 2\). Use the sample sizes you created above and save the result to df.
- Create var_t08 and var_t19 by applying the var() function to the intelligence gain for each of the two training groups. These objects represent the variance for each group.
- Compute the pooled standard error se_pooled using the formula outlined above. You will need to use the sqrt() function in addition to the other variables you've created in this exercise. Mind your brackets!
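To show how the degrees of freedom feed into the decision about the null hypothesis, here is a minimal sketch using hypothetical sample sizes and a hypothetical t-value. The 5% two-sided significance level is an assumption made for illustration; the exercise itself does not specify one.

```r
# Hypothetical values for illustration only (not taken from the exercise data)
n_a <- 5
n_b <- 5
t_obs <- 2

# Degrees of freedom for an independent t-test: n_1 + n_2 - 2
df_example <- n_a + n_b - 2

# Critical t-value for a two-sided test at the assumed 5% level
t_crit <- qt(0.975, df = df_example)

# Two-sided p-value for the observed t-value
p_value <- 2 * pt(-abs(t_obs), df = df_example)

# The null is rejected only if |t| exceeds the critical value
reject_null <- abs(t_obs) > t_crit
```

With 8 degrees of freedom the critical value is roughly 2.31, so a t-value of 2 would not be large enough to reject the null under these assumptions.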
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# The subsets wm_t08 and wm_t19 are still loaded in the console
# Find the mean intelligence gain for both the 8-day and 19-day training groups
mean_t08 <- ___
mean_t19 <- ___
# Calculate the mean difference by subtracting the lowest mean from the highest mean
mean_diff <- ___
# Determine the number of subjects in each sample
n_t08 <- nrow(wm_t08)
n_t19 <- nrow(wm_t19)
# Calculate degrees of freedom
df <- ___
# Calculate variance for each group
var_t08 <- ___
var_t19 <- ___
# Compute pooled standard error
se_pooled <- ___