Comparing frequentist and Bayesian methods
1. Comparing Bayesian and Frequentist Approaches
You may have noticed in the last lesson that our estimates from the stan_glm function were very similar to our parameter estimates from the lm function. If so, you're probably wondering why we need to bother with Bayesian methods at all. We've touched on this a little, but it's worth taking some time now to really understand the differences between the frequentist and Bayesian camps.

2. The same parameters!
When we look at the output from the lm and stan_glm models that we estimated earlier, we see that the estimates and standard errors are very similar. However, the output from the Bayesian model doesn't contain test statistics or p-values.

3. Frequentist vs. Bayesian
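As a quick sketch of the comparison, assuming the kidiq data with kid_score and mom_iq columns used earlier in the course (the variable names here are illustrative):

```r
library(rstanarm)

# Frequentist linear model: parameters fixed, data random
freq_model <- lm(kid_score ~ mom_iq, data = kidiq)
summary(freq_model)   # estimates, std. errors, t statistics, p-values

# Bayesian linear model: same likelihood, but parameters get a posterior
bayes_model <- stan_glm(kid_score ~ mom_iq, data = kidiq)
summary(bayes_model)  # posterior summaries; no test statistics or p-values
```

The point estimates and uncertainties from the two fits will typically be very close, which is exactly the similarity the lesson refers to.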
This represents the fundamental difference between frequentists and Bayesians. In short, frequentists assume that model parameters are fixed and data is random, while Bayesians assume that the data is fixed and the model parameters are random. This can be seen in the interpretation of a p-value. The p-value is the probability of observing data that give rise to a test statistic that large, if the null hypothesis is true. In other words, given a set of true parameter values (the null hypothesis), what's the probability of observing a random dataset that results in a test statistic this large? In contrast, Bayesians assume that the data collected are fixed, and that there is instead a distribution of possible parameter values that could give rise to those data. Said another way, Bayesians are interested in determining the range of parameter values that would give rise to their observed dataset.

4. Evaluating Bayesian parameters
However, it is often desirable to have a method for assessing whether or not a parameter estimated using Bayesian methods is meaningful or significant. For this, we'll use credible intervals. A credible interval will seem very similar to a confidence interval. A confidence interval tells us the probability that a range contains the true value. However, the confidence interval cannot tell us anything about how probable any specific values of the parameter of interest are. This may seem like splitting hairs, but it is an important distinction to make in order to ensure we draw the correct inferences. In contrast, credible intervals do tell us how probable specific values are. This allows us to make inferences about the actual parameter values, rather than about the boundaries of a range.

5. Creating credible intervals
In rstanarm, we can easily calculate credible intervals using the `posterior_interval` function. By default, rstanarm provides the 90% credible interval, but we can create 95% or 50% credible intervals by supplying the desired interval to the prob argument.

6. Confidence vs. Credible intervals
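A minimal sketch of those calls, assuming a fitted rstanarm model named bayes_model (a hypothetical name, not from the lesson's code):

```r
library(rstanarm)

# 90% credible interval (the rstanarm default)
posterior_interval(bayes_model)

# 95% and 50% credible intervals via the prob argument
posterior_interval(bayes_model, prob = 0.95)
posterior_interval(bayes_model, prob = 0.5)
```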
These intervals look very similar to the corresponding confidence intervals. Here is the 95% confidence interval for the mom_iq parameter in our frequentist model. This tells us that there is a 95% chance that the range of 0.49 to 0.72 contains the true value. But we're interested in the probability of the value falling between two points, not the probability of the two points capturing the true value. This is what the credible interval gives us. As you can see, these intervals give very similar ranges. This may often be the case, but the inferences are very different. In the Bayesian scenario, we can ask: what is the probability that the parameter is between 0.60 and 0.65? Here, we see a 31% chance that the true value is in that range. How would we do something similar with confidence intervals from a frequentist model? We can't. Only Bayesian methods allow us to make inferences about the actual values of the parameter.

7. Let's practice!
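One way to sketch that probability calculation, again assuming the hypothetical fitted model bayes_model (the 31% figure comes from the lesson's own fit; your posterior draws will vary):

```r
# Extract the posterior draws for the mom_iq coefficient
draws <- as.matrix(bayes_model, pars = "mom_iq")

# P(0.60 < mom_iq < 0.65) is estimated by the proportion of
# posterior draws that fall inside that range
mean(draws > 0.60 & draws < 0.65)
```

Because a credible interval is just a summary of these draws, any interval probability can be computed the same way.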
Now it's your turn.