T for test and A for Anova

Although the informal graphical material presented so far has indicated no difference between the two treatment groups, most investigators would still require a formal test. We therefore apply a two-sample t-test to assess any difference between the treatment groups, and also calculate a confidence interval for that difference. We use the data without the outlier, created in the previous exercise. The t-test confirms the lack of any evidence for a group difference. The 95% confidence interval is also wide and includes zero, which supports the same conclusion.
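As a rough illustration of what the test and interval report, here is a minimal sketch on simulated data. The data frame `toy` and all of its values are made up for this example and only mimic the layout of the summary data (one mean score per subject plus a treatment factor); they are not the BPRS results.

```r
# Synthetic stand-in for the summary data: one mean score per subject.
# All values are simulated (assumption); they are NOT the BPRS data.
set.seed(1)
toy <- data.frame(
  treatment = factor(rep(c(1, 2), each = 10)),
  mean = c(rnorm(10, mean = 36, sd = 8), rnorm(10, mean = 37, sd = 8))
)

# Two-sample t-test assuming equal variances in the two groups
res <- t.test(mean ~ treatment, data = toy, var.equal = TRUE)

res$statistic  # t statistic
res$parameter  # degrees of freedom: n1 + n2 - 2 = 18
res$p.value    # p-value for the null hypothesis of equal group means
res$conf.int   # 95% confidence interval for the difference in group means
```

If the interval in `res$conf.int` contains zero, the data are compatible with no difference between the groups at the 5% level, which is the same conclusion the p-value gives.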

Baseline measurements of the outcome variable in a longitudinal study are often correlated with the chosen summary measure. Using such a measurement appropriately, as a covariate in an analysis of covariance, can lead to substantial gains in precision. We illustrate the analysis on these data, using the BPRS value at time zero, taken prior to the start of treatment, as the baseline covariate. We see that the baseline BPRS is strongly related to the BPRS values taken after treatment has begun, but there is still no evidence of a treatment difference even after conditioning on the baseline value.
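The analysis of covariance described here can be sketched with `lm()` and `anova()`. The example below uses simulated data in which, by construction, the post-treatment mean depends on the baseline but not on the treatment; the data frame `toy` and its numbers are assumptions made for illustration, not the BPRS data.

```r
# Synthetic data (assumption): 20 subjects, two treatment groups
set.seed(2)
n <- 20
toy <- data.frame(
  treatment = factor(rep(c(1, 2), each = n / 2)),
  baseline = rnorm(n, mean = 48, sd = 10)
)

# Post-treatment mean depends on baseline only; no treatment effect is built in
toy$mean <- 10 + 0.5 * toy$baseline + rnorm(n, sd = 4)

# Analysis of covariance: mean as the response, baseline as the covariate
fit <- lm(mean ~ baseline + treatment, data = toy)

# Sequential ANOVA table: baseline is entered first, then treatment
tab <- anova(fit)
tab
```

Because `anova()` on a single `lm` fit tests terms sequentially, placing `baseline` first in the formula means the `treatment` row is assessed after adjusting for the baseline.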

This exercise is part of the course

Helsinki Open Data Science


Exercise instructions

Perform a two-sample t-test and compare the result with the group differences seen in the boxplots of the previous exercise
Add the baseline from the original data as a new variable to the summary data
Fit the linear model with mean as the response and baseline + treatment as the explanatory variables, using the BPRSL8S2 data (remember the lm() formula y ~ x1 + x2)
Compute the analysis of variance table for the fitted model and pay close attention to the significance of baseline

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# dplyr, tidyr & ggplot2 packages and BPRS, BPRSL8S & BPRSL8S1 data are available

# Perform a two-sample t-test
t.test(mean ~ treatment, data = BPRSL8S1, var.equal = TRUE)

# Add the baseline from the original data as a new variable to the summary data
BPRSL8S2 <- BPRSL8S %>%
  mutate(baseline = BPRS$week0)

# Fit the linear model with the mean as the response 
fit <- lm("Linear model formula here!", data = BPRSL8S2)

# Compute the analysis of variance table for the fitted model with anova()
