Two-sample t-test
A two-sample t-test is used to test whether the means of two populations are equal.
Typical applications are analyses that quantify the impact of a factor, such as testing the effect of a pharmaceutical drug on patients or of a marketing campaign on demand.
Recall that a few assumptions need to be met to carry out a two-sample t-test:
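As a rough illustration of what such a test looks like in R, here is a minimal sketch on simulated data. The vectors control and treatment, their sizes, and their means are hypothetical and are not part of the exercise data.

```r
# Hypothetical simulated data (not from the exercise)
set.seed(42)
control   <- rnorm(30, mean = 100, sd = 10)
treatment <- rnorm(30, mean = 108, sd = 10)

# Student's two-sample t-test.
# var.equal = TRUE assumes homogeneous variances;
# the default (var.equal = FALSE) would run Welch's test instead.
result <- t.test(treatment, control, var.equal = TRUE)
print(result)

# Null hypothesis: the two population means are equal.
# A small p-value is evidence against it.
result$p.value
```

With var.equal = TRUE the test uses n1 + n2 - 2 degrees of freedom, which is 58 for these two samples of 30.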
- Random samples
- Independent observations
- Normally distributed underlying data
- Homogeneity of variances
The first two assumptions need to be met when designing the experiment. The last two can be tested using the Shapiro-Wilk test and Bartlett's test, respectively.
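The two checks mentioned above can be sketched as follows. This is an illustrative example on simulated data; group_a, group_b, and the grouping factor groups are hypothetical names, not objects from the exercise.

```r
# Hypothetical simulated data (not from the exercise)
set.seed(1)
group_a <- rnorm(40, mean = 5, sd = 2)
group_b <- rnorm(40, mean = 6, sd = 2)

# Shapiro-Wilk test, run separately on each sample.
# Null hypothesis: the data come from a normal distribution.
sw_a <- shapiro.test(group_a)
sw_b <- shapiro.test(group_b)

# Bartlett's test on the pooled values with a grouping factor.
# Null hypothesis: the groups have equal variances.
values <- c(group_a, group_b)
groups <- factor(rep(c("a", "b"), each = 40))
bt <- bartlett.test(values, groups)

# Collect the three p-values in one named vector
p_values <- c(shapiro_a = sw_a$p.value,
              shapiro_b = sw_b$p.value,
              bartlett  = bt$p.value)
p_values
```

In both tests a large p-value means there is no evidence against the assumption; it does not prove that the assumption holds.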
A company provided you with the df data frame. The sample column indicates the sample, and the value column contains numerical data. The dplyr package is available in your environment.
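To make the layout concrete, the sketch below builds a stand-in data frame with the same two columns. The name df_demo, the numeric coding of sample as 1 and 2, the group sizes, and the simulated values are all assumptions made for illustration; the real df may differ.

```r
library(dplyr)

# Hypothetical stand-in for df: one row per observation,
# `sample` labels the group and `value` holds the measurement
set.seed(7)
df_demo <- data.frame(
  sample = rep(c(1, 2), each = 25),
  value  = c(rnorm(25, mean = 50, sd = 5),
             rnorm(25, mean = 53, sd = 5))
)

# Per-sample size, mean, and standard deviation
summary_tbl <- df_demo %>%
  group_by(sample) %>%
  summarise(n = n(), mean_value = mean(value), sd_value = sd(value))

summary_tbl
```

Keeping both samples in one long data frame like this is what makes it possible to pick out a single sample with filter().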
This exercise is part of the course Practicing Statistics Interview Questions in R.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Return the first part of df
head(___)
# Test normality of sample 1
sample1 <- df %>% filter(sample == ___) %>% select(value) %>% pull()
shapiro.test(___)
# Test normality of sample 2
sample2 <- df %>% filter(sample == 2) %>% select(___) %>% pull()
shapiro.test(___)