1. Random effects in regressions
So far, we've looked at slopes and intercepts. But what if one variable is nested within another?
2. Nested students
For example, our school data could be illustrated by this figure. We have students nested within classrooms and classrooms nested within schools.
3. Nested variables
More generally, a nested relationship is hierarchical in nature, and it creates a multi-level model. Mathematically, this is called a mapping from one distribution to another. Practically, it gives us a way to pool or share information across replicates. Thus, outliers in groups with small sample sizes have less impact when we treat the grouping variable as a random effect.
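To make the pooling idea concrete, here is a minimal base-R sketch. All numbers are made up, and the two standard deviations are assumed to be known, whereas a mixed-effects model would estimate them from the data. It shows how a small group's mean is pulled toward the overall mean more strongly than a large group's mean.

```r
# Hypothetical example: three classrooms with different sample sizes.
# All numbers below are invented for illustration.
group_means <- c(a = 70, b = 75, c = 95)  # observed classroom means
group_n     <- c(a = 30, b = 25, c = 2)   # students per classroom

# Assumed known for simplicity; a mixed-effects model would estimate these.
sigma_within  <- 10  # sd of students within a classroom
sigma_between <- 5   # sd of true classroom means around the overall mean

# Precision-weighted overall mean across classrooms.
w <- 1 / (sigma_between^2 + sigma_within^2 / group_n)
grand_mean <- sum(w * group_means) / sum(w)

# Shrinkage weight: close to 1 for large groups, smaller for small groups.
shrink <- sigma_between^2 / (sigma_between^2 + sigma_within^2 / group_n)

# Partially pooled estimate: a blend of each group's mean and the overall mean.
pooled_means <- shrink * group_means + (1 - shrink) * grand_mean

print(round(cbind(group_means, group_n, shrink, pooled_means), 2))
```

Classroom c has only two students, so its unusually high mean gets the smallest weight and moves furthest toward the overall mean.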
4. Algebraic representation
This equation depicts a simple random effect. The top equation describes how the data depend on the i-th beta. The random effect assumes that each beta is drawn from a normal distribution with mean mu and standard deviation sigma. This is the algebraic representation of the multi-level model.
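The slide's equation is not reproduced in this transcript, so the following is one plausible way to write the model just described. The symbols y, x, and epsilon, and the index j for observations within group i, are assumptions; only beta, mu, and sigma are named in the narration.

```latex
\begin{align}
  y_{ij}  &= \beta_i \, x_{ij} + \epsilon_{ij} \\
  \beta_i &\sim \mathrm{Normal}(\mu, \sigma^2)
\end{align}
```

Here sigma is the standard deviation, so the second argument of the normal distribution is written as the variance, sigma squared.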
5. R syntax
In R, we will use the lme4 package. This package is commonly used and is newer than the nlme package. Its model-fitting function is l-m-e-r, also pronounced "lemur". Both the lme4 and nlme packages share some of the same authors. Other packages exist because the numerical methods behind mixed-effects models are still an area of active research. With the lme4 notation, we need to specify a random effect. To specify a random intercept, we use parentheses around one, pipe, random-effect group, written (1 | group). On US keyboards, the pipe key is above the enter key on the right side of the keyboard. To specify a random slope, we use parentheses around slope, pipe, random-effect group, written (slope | group).
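As a rough sketch of this syntax, the example below fits both kinds of random effect. It uses the sleepstudy data that ships with lme4 as a stand-in, since the school data's column names are not given here; Reaction, Days, and Subject are columns of that example dataset, not of the school data.

```r
library(lme4)  # provides lmer() and the sleepstudy example data

# sleepstudy: reaction times (Reaction) over days of sleep deprivation (Days)
# for 18 subjects (Subject). It stands in for the school data here.

# Random intercept: each subject gets its own baseline.
fit_intercept <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Random slope: each subject gets its own Days effect.
# Note: (Days | Subject) also includes a correlated random intercept.
fit_slope <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

fixef(fit_slope)                # population-level intercept and slope
head(ranef(fit_slope)$Subject)  # per-subject deviations from those values
```

The part of the formula outside the parentheses is the usual fixed-effect regression, and each parenthesized term adds one random effect for the group named after the pipe.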
6. Random-effect models with school data
So far, we've seen the theory behind multi-level models. Next, you will get to apply multi-level models to school data.
In the school dataset, we are specifically interested in examining whether different factors affect how students learn math. First, we examine whether a student's sex impacts their learning. Second, we examine whether a teacher's training or math knowledge impacts learning. Last, we will plot the parameter estimates using ggplot2.
7. Let's practice!
Now, it's your turn to answer these questions with the data!