What's in a Bayesian Model?
1. What's in a Bayesian Model?
In the previous chapter we learned how to estimate a Bayesian linear regression. In this chapter we're going to learn how these models can be modified. Unlike in a frequentist regression, there are several ways we can modify the estimation process itself. This matters because these choices ultimately impact your results. We'll start by looking at the sampling of the posterior that creates the distribution summaries we see in the output.
2. Posterior distributions
As we briefly discussed last chapter, the posterior distribution is sampled in groups called chains. Each sample within a chain is called an iteration. The chains begin at random locations. As sampling proceeds, each chain moves toward the area where the combination of likelihood and prior indicates a high probability of the true parameter value residing. The more iterations in a chain, the larger the sample from the posterior distribution will be. This means the summaries are directly impacted by the length of the chain, as larger samples allow for more robust estimates of those summary statistics.
3. Sampling the posterior distribution
Here is an example of this process, shown in what is known as a trace plot. It shows the value of the parameter at each iteration for each chain. We can see that each chain starts in a different location, but they all converge on the same area. This is the convergence we talked about measuring with the R-hat statistic. Convergence is important because it ensures stable estimates. We can see that this model has not converged at the beginning: the chains are in different places and not horizontal, meaning the estimates are not stable.
4. Sampling the Posterior Distribution
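A trace plot like the one described above can be drawn directly from a fitted rstanarm model. This is a minimal sketch: the model object, formula, and data here are placeholders for illustration, not the course's actual example.

```r
library(rstanarm)

# Hypothetical model fit, just so we have something to plot
stan_model <- stan_glm(mpg ~ wt, data = mtcars,
                       chains = 2, iter = 1000)

# Trace plot: the sampled value of the `wt` coefficient at each
# iteration, drawn separately for each chain
plot(stan_model, plotfun = "trace", pars = "wt")
```

If the chains overlap in a stable horizontal band, they have mixed; chains wandering in different regions are the visual signature of non-convergence.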
Because the model has not converged at the beginning, we discard these iterations, leaving only the converged iterations to make up our final posterior distribution. Here, we can see that by using only the final 1,000 iterations, all of the chains are fully mixed. The discarded iterations are known as warm-up, or burn-in. By default, the rstanarm package estimates 4 chains. Each chain is 2,000 iterations long, and the first 1,000 are discarded as warm-up. For the exercises in this course, to cut down on estimation time, we've changed the default to 2 chains, each with 1,000 iterations, with the first 500 discarded as warm-up. In your own work, we recommend using the rstanarm defaults.
5. Changing the number and length of chains
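The arithmetic behind the default posterior sample size follows directly from those numbers:

```r
# rstanarm defaults: 4 chains of 2,000 iterations,
# with the first 1,000 per chain discarded as warm-up
chains <- 4
iter   <- 2000
warmup <- 1000

(iter - warmup) * chains  # 4,000 post-warmup draws form the posterior
```

The course's reduced settings (2 chains, 1,000 iterations, 500 warm-up) give (1000 - 500) * 2 = 1,000 draws instead.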
We change this behavior in rstanarm using the chains, iter, and warmup arguments. Here we've specified that we want three chains, each with 1,000 iterations, the first 500 of which should be discarded as warm-up. This means our posterior distribution will be made up of 1,500 total samples (500 from each chain).
6. Changing the number and length of chains
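A call with these settings can be sketched as follows; the formula and data are hypothetical stand-ins, as the point is only where the three arguments go.

```r
library(rstanarm)

custom_model <- stan_glm(
  mpg ~ wt,        # placeholder formula for illustration
  data   = mtcars, # placeholder data
  chains = 3,      # three chains instead of the default four
  iter   = 1000,   # each chain runs 1,000 total iterations
  warmup = 500     # the first 500 per chain are discarded
)

# (1000 - 500) * 3 = 1,500 draws in the final posterior
summary(custom_model)
```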
Indeed, when we look at the summary of the model, under Model Info, we see that the sample is 1500.
7. Too short chains
However, we have to be careful not to make the chains too few or too short. Using the same example from earlier, if we had instead requested only 500 iterations and discarded the first 250, our posterior distribution would not have converged, because the chains haven't mixed.
8. How many iterations?
Because of this, the number of iterations is a balancing act. Fewer iterations means the model estimates faster, but too few iterations may keep the model from converging. The number of iterations needed is different for each model, so it's important to pay attention to our R-hat values. If you estimate a model and it hasn't converged, increasing the number of iterations and the length of the warm-up is a good place to start.
9. Let's practice!
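One way to check R-hat for a fitted rstanarm model is sketched below; `stan_model` is a hypothetical fitted object, and the `rhat()` extractor comes from the bayesplot package that rstanarm builds on.

```r
library(rstanarm)

# Hypothetical fit for illustration
stan_model <- stan_glm(mpg ~ wt, data = mtcars,
                       chains = 2, iter = 1000)

summary(stan_model)          # the Rhat column should sit near 1
bayesplot::rhat(stan_model)  # named vector of R-hat values per parameter
```

R-hat values well above 1 (a common rule of thumb flags anything over about 1.1) suggest re-running with more iterations and a longer warm-up.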
Now let's try some examples.