Factorial experimental design

1. Factorial experimental design

Now you will learn how to model studies with a factorial experimental design.

2. Factorial designs

A balanced factorial design includes the same number of samples for each combination of the experimental variables. In the case of a 2x2 factorial design, there are 2 experimental variables, each with 2 levels, for a total of 4 experimental groups. In this video, I will analyze a 2x2 factorial experiment on the effect of low temperature in the plant species _Arabidopsis thaliana_. The study used two different types of Arabidopsis, also known as accessions: col, for Columbia, and vte2, a mutant deficient in vitamin E production. Three replicates of each type were grown at each of the two temperatures, normal and low. This is a subset of the data included in the study published by Maeda and colleagues. The data contains measurements for 11,871 genes and 12 plants, 3 replicates for each combination of the two factors.

3. Group-means model for 2x2 factorial

Using the group-means parametrization, I want to create one coefficient for each of the 4 groups. Each group contains the samples of one of the two types of plants at one of the two temperatures.

4. Group-means design matrix for 2x2 factorial

Unlike in the previous study designs, I don't have a single variable that describes the 4 groups of samples, and therefore I need to create one from the existing variables in the phenotype data frame. To do this, I use the function `paste` to combine the variables `type` and `temp` into a single variable with a period separating the two values. For convenience, I use the function `with`, which allows me to easily refer to both of the columns of the phenotype data frame. I also convert this character vector to a factor. Recall that a factor records the unique levels of a variable, which we will use in the next step. To create a model without an intercept, I start the formula with zero and then pass my new group variable. To make the column names of the design matrix shorter and thus easier to use in the contrasts matrix, I replace the column names with the factor levels using the function `levels`. The resulting design matrix has 4 coefficients, each named for one of the combinations of plant type and temperature. And as expected, each column sums to 3, the number of samples modeled by each coefficient.
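The steps just described can be sketched in base R as follows. This is a minimal illustration, not the course's exact code: the data frame `pheno` is a hypothetical stand-in built by hand, with the column names `type` and `temp` taken from the description above, and the variable names `group` and `design` are assumptions.

```r
# Hypothetical phenotype data frame standing in for the real one:
# 2 plant types x 2 temperatures x 3 replicates = 12 samples.
pheno <- data.frame(
  type = rep(c("col", "vte2"), each = 6),
  temp = rep(rep(c("normal", "low"), each = 3), times = 2)
)

# Combine the two variables into one, separated by a period,
# then convert the character vector to a factor.
group <- with(pheno, paste(type, temp, sep = "."))
group <- factor(group)

# Group-means model: the leading 0 removes the intercept,
# so each group gets its own coefficient.
design <- model.matrix(~0 + group)

# Shorten the column names to the factor levels.
colnames(design) <- levels(group)

# Each column should sum to the number of samples in that group.
colSums(design)
```

Because `factor` sorts its levels alphabetically, the columns of `design` come out in the same order as `levels(group)`, which is what makes the renaming step safe.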

5. Contrasts for a 2x2 factorial

In this 2x2 factorial design, there are 5 contrasts I want to test. First, the differences between the 2 types of plant at normal temperature. Second, the differences between the 2 types of plant at low temperature. Third, the differences between temperatures for the vte2 type. Fourth, the differences between temperatures for the col type. Lastly, the differences in the effect of `temp` between the two types of plants. This is traditionally referred to as an interaction effect. In other words, how does the response to temperature differ between the 2 types of plants?

6. Contrasts matrix for 2x2 factorial

I translate these 5 contrasts by referring to the column names of the design matrix. Note how the interaction contrast is the difference between the two previous contrasts, `temp_vte2` and `temp_col`. Viewing the contrasts matrix, you can see that the first 4 contrasts each compare only two coefficients, and then the final interaction contrast involves all 4.
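To make the structure of the contrasts matrix concrete, here is a hand-built sketch in base R. Several things are assumptions: the direction of each contrast (vte2 minus col, low minus normal), the name `interaction` for the last column, and the group names `col.low`, `col.normal`, `vte2.low`, and `vte2.normal`, which follow from pasting `type` and `temp` with a period. In practice the limma function `makeContrasts` can build an equivalent matrix from expressions such as `vte2.normal - col.normal`.

```r
# Rows are the four group-means coefficients (assumed names),
# in the alphabetical order produced by factor().
groups <- c("col.low", "col.normal", "vte2.low", "vte2.normal")

# Each column is one contrast; the weights pick out which
# coefficients are compared (assumed direction of subtraction).
cm <- cbind(
  type_normal = c( 0, -1, 0,  1),  # vte2.normal - col.normal
  type_low    = c(-1,  0, 1,  0),  # vte2.low    - col.low
  temp_vte2   = c( 0,  0, 1, -1),  # vte2.low    - vte2.normal
  temp_col    = c( 1, -1, 0,  0)   # col.low     - col.normal
)

# The interaction is the difference of the two temperature contrasts.
cm <- cbind(cm, interaction = cm[, "temp_vte2"] - cm[, "temp_col"])
rownames(cm) <- groups

cm
```

Printing `cm` shows the pattern described above: each of the first four columns has exactly two non-zero weights, while the interaction column has a non-zero weight for every group.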

7. Testing 2x2 factorial

Testing follows the now familiar pipeline. The two types of plants have similar function at normal temperature (see the contrast `type_normal`), but they diverge in behavior at low temperature (see the contrast `type_low`).
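For reference, a sketch of what such a pipeline can look like with the limma package. It assumes that the "familiar pipeline" refers to the limma functions `lmFit`, `contrasts.fit`, `eBayes`, and `decideTests`. The expression matrix `x` is simulated random noise, not the Arabidopsis data, so the counts it produces say nothing about the real experiment. The contrast directions and the name `interaction` are also assumptions.

```r
set.seed(1)

# Hypothetical phenotype data and group-means design matrix.
pheno <- data.frame(
  type = rep(c("col", "vte2"), each = 6),
  temp = rep(rep(c("normal", "low"), each = 3), times = 2)
)
group <- factor(with(pheno, paste(type, temp, sep = ".")))
design <- model.matrix(~0 + group)
colnames(design) <- levels(group)

# Simulated expression matrix: 100 genes x 12 samples of pure noise.
x <- matrix(rnorm(100 * 12), nrow = 100,
            dimnames = list(paste0("gene", 1:100), NULL))

results <- NULL
if (requireNamespace("limma", quietly = TRUE)) {
  # Contrasts defined in terms of the design matrix column names.
  cm <- limma::makeContrasts(
    type_normal = vte2.normal - col.normal,
    type_low    = vte2.low - col.low,
    temp_vte2   = vte2.low - vte2.normal,
    temp_col    = col.low - col.normal,
    interaction = (vte2.low - vte2.normal) - (col.low - col.normal),
    levels = design
  )

  fit  <- limma::lmFit(x, design)                  # fit the group-means model
  fit2 <- limma::contrasts.fit(fit, contrasts = cm) # apply the contrasts
  fit2 <- limma::eBayes(fit2)                       # moderated t-statistics
  results <- limma::decideTests(fit2)               # classify each gene per contrast
  print(summary(results))
}
```

The summary table has one column per contrast, so comparing the `type_normal` and `type_low` columns is how the statement about the two plant types would be checked on the real data.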

8. The effect of drought on Populus trees

In the following exercises, you will analyze a 2x2 factorial experiment. The researchers exposed two types of _Populus_ trees to normal and drought conditions. The data is a subset of the data from the publication by Wilkins and colleagues. It contains measurements for 16,172 genes and 12 samples, 3 replicates for each combination.

9. Let's practice!

Now it's your turn.