T for test and A for Anova
Although the informal graphical material presented so far has indicated no difference between the two treatment groups, most investigators would still require a formal test. Consequently, we shall now apply a t-test to assess any difference between the treatment groups and also calculate a confidence interval for this difference. We use the data without the outlier, created in the previous exercise. The t-test confirms the lack of any evidence for a group difference. The 95% confidence interval is also wide and includes zero, which leads to the same conclusion.
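As a minimal sketch of how the test and the confidence interval relate, the following R code runs an equal-variance two-sample t-test on synthetic data. The data frame `toy` and its values are hypothetical stand-ins, not the BPRS summary data; only the column names mirror the exercise.

```r
# Synthetic stand-in for the summary data: one mean score per subject.
# (Hypothetical values; not the BPRS data.)
set.seed(1)
toy <- data.frame(
  treatment = factor(rep(c(1, 2), each = 10)),
  mean = c(rnorm(10, mean = 36, sd = 7), rnorm(10, mean = 37, sd = 7))
)

# Two-sample t-test assuming equal variances in the two groups
res <- t.test(mean ~ treatment, data = toy, var.equal = TRUE)

res$statistic  # t value
res$parameter  # degrees of freedom: n1 + n2 - 2
res$p.value    # p-value for the null hypothesis of equal group means
res$conf.int   # 95% confidence interval for the difference in means

# The 95% interval contains zero exactly when the test is
# not significant at the 5% level
ci_has_zero <- res$conf.int[1] <= 0 && res$conf.int[2] >= 0
ci_has_zero == (res$p.value >= 0.05)
```

The last line illustrates why the text can draw the same conclusion from either the p-value or the interval: the confidence interval is obtained by inverting the same t-test.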
Baseline measurements of the outcome variable in a longitudinal study are often correlated with the chosen summary measure. Including the baseline as a covariate in an analysis of covariance can therefore lead to substantial gains in precision. We can illustrate this analysis using the BPRS value at time zero, taken prior to the start of treatment, as the baseline covariate. We see that the baseline BPRS is strongly related to the BPRS values taken after treatment has begun, but there is still no evidence of a treatment difference even after conditioning on the baseline value.
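The analysis of covariance described above can be sketched on synthetic data as follows. The data frame `toy` is hypothetical and is constructed so that the outcome depends on the baseline but not on the treatment; it is not the BPRS data.

```r
# Synthetic data (hypothetical values): 20 subjects, two treatment groups
set.seed(2)
n <- 20
toy <- data.frame(
  treatment = factor(rep(c(1, 2), each = n / 2)),
  baseline  = rnorm(n, mean = 48, sd = 12)
)

# By construction the post-treatment mean depends on the baseline,
# but not on the treatment group
toy$mean <- 10 + 0.5 * toy$baseline + rnorm(n, sd = 4)

# Analysis of covariance: baseline as covariate, treatment as factor
fit <- lm(mean ~ baseline + treatment, data = toy)

# anova() on a single lm fit gives sequential sums of squares, so the
# treatment row is tested after adjusting for the baseline
tab <- anova(fit)
tab

# Estimated coefficients with standard errors
summary(fit)$coefficients
```

Because `anova()` tests terms in the order they appear in the formula, placing `baseline` before `treatment` is what makes the treatment row a test conditional on the baseline.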
This exercise is part of the course
Helsinki Open Data Science
Exercise instructions
- Perform a two-sample t-test and compare the result with the differences seen in the boxplots of the previous exercise
- Add the baseline from the original data as a new variable to the summary data
- Fit the linear model with mean as the response and baseline + treatment as the explanatory variables, using the BPRSL8S2 data (remember the lm() formula y ~ x1 + x2)
- Compute the analysis of variance table for the fitted model and pay close attention to the significance of baseline
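Adding the baseline as a new column by position is correct only if the summary data and the original data list the subjects in the same order. The sketch below contrasts that with a keyed merge on toy data. All values are hypothetical, the summary is assumed to be the mean over the post-baseline weeks, and base R is used so that the example runs without extra packages.

```r
# Toy wide data in the shape of BPRS: one row per subject
# (hypothetical values; subject ids repeat across treatment groups)
wide <- data.frame(
  treatment = factor(c(1, 1, 2, 2)),
  subject   = factor(c(1, 2, 1, 2)),
  week0 = c(42, 58, 49, 52),
  week1 = c(36, 68, 47, 48),
  week2 = c(36, 61, 40, 42)
)

# Summary measure: mean of the post-baseline weeks for each subject
summ <- data.frame(
  treatment = wide$treatment,
  subject   = wide$subject,
  mean      = rowMeans(wide[, c("week1", "week2")])
)

# Positional version: correct only if the row order of both tables matches
summ_pos <- transform(summ, baseline = wide$week0)

# Keyed version: match on treatment and subject, so row order is irrelevant.
# The summary rows are shuffled here to demonstrate this.
set.seed(3)
base_tab <- data.frame(
  treatment = wide$treatment,
  subject   = wide$subject,
  baseline  = wide$week0
)
summ_key <- merge(summ[sample(nrow(summ)), ], base_tab,
                  by = c("treatment", "subject"))
summ_key
```

Both key columns are needed in the merge because a subject id on its own does not identify a row when the same ids are reused in each treatment group.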
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# dplyr, tidyr & ggplot2 packages and BPRSL8S & BPRSL8S1 data are available
# Perform a two-sample t-test
t.test(mean ~ treatment, data = BPRSL8S1, var.equal = TRUE)
# Add the baseline from the original data as a new variable to the summary data
BPRSL8S2 <- BPRSL8S %>%
  mutate(baseline = BPRS$week0)
# Fit the linear model with the mean as the response 
fit <- lm("Linear model formula here!", data = BPRSL8S2)
# Compute the analysis of variance table for the fitted model with anova()