Prior distributions

1. Prior distributions

Now that we've talked about how to control the size of our posterior distribution samples, we can talk about one of the components of the posterior distribution that we've only mentioned in passing so far: prior distributions.

2. What's a prior distribution?

Prior distributions reflect our prior beliefs about the values of parameters. This information gets combined with the likelihood of the data to create the posterior distribution.

3. Visualizing prior distributions

Here is an example of how prior distributions can affect the resulting posterior distribution. The likelihood of the data, indicated by the dashed purple line, stays the same across all three panels. Notice how, as the teal line showing the prior gets narrower, or more informative, the posterior distribution shifts closer to the prior and away from the likelihood. In general, priors have more influence when their distributions are narrow or when we have less data. We can think about the prior like an additional data point. If we have a sample of five, a sixth data point can be really influential. If we have a sample of 5000, one extra data point won't have much of an effect. Because priors have the potential to be influential, it is almost always a good idea to use non-informative or weakly informative priors, unless you have a good reason to believe that your parameters come from the informative distribution specified by the prior.
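
The pull of a narrow prior, and how it fades with more data, can be sketched in a few lines of R. This is only an illustration, assuming a normal prior and normally distributed data with a known standard deviation, which is a case where the posterior has a closed form. The function name `posterior_normal` and all of the numbers are hypothetical; they are not the values behind the plots.

```r
# Closed-form posterior for a normal mean with a normal prior
# and a known data standard deviation (illustrative assumption).
posterior_normal <- function(prior_mean, prior_sd, data_mean, data_sd, n) {
  prior_precision <- 1 / prior_sd^2      # narrower prior = higher precision
  data_precision  <- n / data_sd^2       # more data = higher precision
  post_var  <- 1 / (prior_precision + data_precision)
  post_mean <- post_var * (prior_precision * prior_mean +
                           data_precision * data_mean)
  c(mean = post_mean, sd = sqrt(post_var))
}

# Hypothetical setup: prior centered at 0, data centered at 10 with sd 5.
wide_small   <- posterior_normal(0, 10, 10, 5, n = 5)     # wide prior, 5 observations
narrow_small <- posterior_normal(0, 1,  10, 5, n = 5)     # narrow prior, 5 observations
narrow_large <- posterior_normal(0, 1,  10, 5, n = 5000)  # narrow prior, 5000 observations

print(rbind(wide_small, narrow_small, narrow_large))
```

With only five observations, the narrow prior drags the posterior mean toward 0, while the wide prior leaves it near the data. With 5000 observations, even the narrow prior has little effect.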

4. Prior distributions in rstanarm

We can view the prior distributions for each of our parameters in an rstanarm model by using the `prior_summary` function. By default, the intercept gets a normally distributed prior with a mean of 0 and a standard deviation of 10, and other coefficients get a normally distributed prior with a mean of 0 and a standard deviation of 2.5. Auxiliary is the error standard deviation. This uses an exponentially distributed prior with a rate of 1. However, notice that there are also adjusted scales listed. This is because rstanarm recognizes that these defaults may not be appropriate for every dataset, so it adjusts the scale of each prior based on your data. For example, here, the standard deviation for the prior of the intercept was adjusted to 204.11.
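
As a rough sketch of that call, the code below fits a model and prints its priors. It assumes the children's IQ model is `kid_score ~ mom_iq` fit with `stan_glm` on the `kidiq` data that ships with rstanarm; the transcript does not show the model call, so treat the formula as an assumption. Default priors have changed between rstanarm versions, so the printed values may differ from the ones described here.

```r
library(rstanarm)

# Assumed model: children's IQ predicted by mother's IQ (kidiq ships with rstanarm).
# refresh = 0 only silences the sampler's progress output.
model <- stan_glm(kid_score ~ mom_iq, data = kidiq, refresh = 0)

# Show the priors used for the intercept, coefficients, and auxiliary parameter,
# including any adjusted scales.
priors <- prior_summary(model)
print(priors)
```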

5. Calculating adjusted scales

The adjusted scale for the intercept is calculated as 10 times the standard deviation of the dependent variable. We use 10 because this is the default scale rstanarm uses for the intercept. For predictors, the adjusted scale is 2.5 divided by the standard deviation of the predictor, multiplied by the standard deviation of the dependent variable, that is, (2.5 / sd(x)) * sd(y). Just like with the intercept, we use 2.5 because this is the default scale used for predictors. Look again at the priors for the intercept and predictor in the children's IQ model. By taking 10 times the standard deviation of kid_score, our dependent variable, we get the adjusted scale of 204.11. Similarly, by taking 2.5 divided by the standard deviation of the mom's IQ, times the standard deviation of the children's IQ, we get the adjusted scale of 3.40.
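
The two calculations can be written out in base R as a quick check. The helper names `adjusted_intercept_scale` and `adjusted_coef_scale` are hypothetical (they are not rstanarm functions), and the toy vectors are made up. The last lines redo the arithmetic with standard deviations back-calculated from the numbers quoted above, where the value of 15 for the mom's IQ is an approximation.

```r
# Adjusted scale for the intercept: default scale times sd of the outcome.
adjusted_intercept_scale <- function(y, default_scale = 10) {
  default_scale * sd(y)
}

# Adjusted scale for a coefficient: (default scale / sd of predictor) * sd of outcome.
adjusted_coef_scale <- function(x, y, default_scale = 2.5) {
  default_scale / sd(x) * sd(y)
}

# Hypothetical toy data, not the children's IQ data.
y <- c(65, 98, 85, 83, 115)
x <- c(121, 89, 115, 99, 93)
print(adjusted_intercept_scale(y))
print(adjusted_coef_scale(x, y))

# Arithmetic check against the quoted values.
sd_kid_score <- 204.11 / 10   # implied by the adjusted intercept scale
sd_mom_iq    <- 15            # approximate, implied by the adjusted scale of 3.40
coef_scale_check <- 2.5 / sd_mom_iq * sd_kid_score
print(round(coef_scale_check, 2))
```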

6. Unadjusted priors

rstanarm automatically adjusts the priors in order to ensure that the specified priors are not too informative. However, we can turn off this adjustment if we want to. To do this, we can specify `autoscale = FALSE` for the intercept prior (which is `prior_intercept`), the coefficients (which is just `prior`), the error (which is `prior_aux`), or any combination. After specifying `autoscale = FALSE`, we can see that there are no longer adjusted scales in the `prior_summary` output.
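
A sketch of turning off the adjustment for all three priors at once is shown below. As before, the `kid_score ~ mom_iq` formula on the `kidiq` data is an assumption about the model, and the locations, scales, and rate are written out explicitly to match the defaults described earlier rather than relying on whatever the installed rstanarm version uses.

```r
library(rstanarm)

# Same assumed model, with autoscaling switched off for every prior.
no_scale <- stan_glm(
  kid_score ~ mom_iq,
  data = kidiq,
  prior_intercept = normal(location = 0, scale = 10, autoscale = FALSE),
  prior           = normal(location = 0, scale = 2.5, autoscale = FALSE),
  prior_aux       = exponential(rate = 1, autoscale = FALSE),
  refresh = 0   # only silences the sampler's progress output
)

# The summary should now list the priors exactly as specified.
print(prior_summary(no_scale))
```

Any one of the three arguments can be changed on its own; the others keep their default behavior.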

7. Let's practice!

Now it's your turn to explore priors in rstanarm.
