
Reporting Bayesian results

1. Reporting Bayesian results

Welcome back! Let's now talk about how to report the results of a Bayesian analysis.

2. The honest way

The truly honest way of reporting Bayesian parameter estimates is to present the prior and the posterior of each parameter. Assuming you have the draws from the prior and posterior distributions in numpy arrays, you can plot the two distributions with seaborn's kdeplot function.
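A minimal sketch of such a plot is shown below; prior_draws and posterior_draws are placeholder arrays, simulated here only so the snippet runs on its own.

```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# Placeholder draws, simulated for illustration only
prior_draws = np.random.normal(loc=0, scale=5, size=10_000)
posterior_draws = np.random.normal(loc=1, scale=1, size=10_000)

# Overlay the two densities so the shift from prior to posterior is visible
sns.kdeplot(prior_draws, fill=True, label="prior")
sns.kdeplot(posterior_draws, fill=True, label="posterior")
plt.legend()
plt.show()
```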

3. The honest way

This provides your audience with full information:

4. The honest way

what are the most likely posterior values,

5. The honest way

what is the range of possible values,

6. The honest way

and how the prior influences the posterior. Unfortunately, with many parameters in the model, this approach becomes infeasible, so the information has to be summarized.

7. Bayesian point estimates

No single number can fully convey the complete information contained in a distribution. Yet, sometimes such information compression is necessary and we need a single number called a point estimate.

8. Bayesian point estimates

There are many different point estimates you can calculate. One is the expected value, or mean, of the posterior distribution, marked in red on the slide.

9. Bayesian point estimates

You can also summarize the posterior distribution by its median, shown in orange. The median is the value such that we are 50% sure the parameter is at least this large. The mean and the median can be easily computed with the corresponding numpy functions. For a symmetric distribution like the one in the picture, they are almost the same, but this does not hold for all distributions.
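In code, computing these two point estimates could look like the following sketch, again using illustrative draws in place of a real posterior.

```python
import numpy as np

# Illustrative posterior draws for one parameter
posterior_draws = np.random.normal(loc=1, scale=1, size=10_000)

posterior_mean = np.mean(posterior_draws)      # expected value of the posterior
posterior_median = np.median(posterior_draws)  # 50th percentile
print(posterior_mean, posterior_median)
```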

10. Bayesian point estimates

You can also calculate percentiles based on the posterior draws. The 75th percentile, computed with np-dot-percentile and marked in green, is the value below which 75% of the distribution's mass lies, which means we are 75% sure the parameter's value is at most this.
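A short sketch of this calculation, with placeholder draws:

```python
import numpy as np

# Illustrative posterior draws for one parameter
posterior_draws = np.random.normal(loc=1, scale=1, size=10_000)

# 75% of the posterior mass lies below this value
p75 = np.percentile(posterior_draws, 75)
print(p75)
```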

11. Credible intervals

In addition to providing a point estimate, it is good practice to provide a measure of uncertainty in the estimation. This can be achieved by computing a credible interval. A credible interval is an interval that contains the parameter with a given probability, for instance 90%. The wider the credible interval, the larger the range of values the parameter is likely to take, and hence the greater the uncertainty in its estimate. Remember that in the Bayesian world, a parameter is a random variable, so we can talk about the probability that it falls into some interval. This is in contrast to the frequentist world, where it is the interval (called a confidence interval) that is random while the parameter is a fixed value. Frequentists can only make probabilistic statements about the interval, not the parameter. It is a subtle but important distinction.
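As a side note, one simple way to build a credible interval from posterior draws (not the HPD approach discussed next, but an equal-tailed interval based on percentiles) is sketched below with illustrative draws.

```python
import numpy as np

# Illustrative posterior draws for one parameter
posterior_draws = np.random.normal(loc=1, scale=1, size=10_000)

# Equal-tailed 90% credible interval: cut off 5% of the mass on each side
lower, upper = np.percentile(posterior_draws, [5, 95])
print(lower, upper)
```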

12. Highest Posterior Density (HPD)

One way to compute a credible interval is by Highest Posterior Density, or HPD. To understand how the HPD intervals are constructed, imagine a horizontal line hovering above a distribution. It starts descending until it intersects with the density plot. The two points at which the intersection occurs mark the ends of the interval. The line descends further, thus widening the interval, until the probability mass inside the interval reaches the desired value. To calculate it, we will use the hdi function from the arviz package. arviz is a package for exploratory analysis of Bayesian models. We can pass the posterior draws to hdi, setting the highest density interval probability to 0-point-9, that is 90%. As a result, we obtain the lower and upper bounds of the interval, which we interpret as follows: the probability that the parameter lies between negative 4-point-86 and 4-point-96 is 90%.
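A sketch of this HPD calculation, with illustrative draws standing in for the real posterior:

```python
import numpy as np
import arviz as az

# Illustrative posterior draws for one parameter
posterior_draws = np.random.normal(loc=1, scale=1, size=10_000)

# 90% highest density interval: returns the lower and upper bound
hdi_bounds = az.hdi(posterior_draws, hdi_prob=0.9)
print(hdi_bounds)
```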

13. Let's practice reporting Bayesian results!

Let's practice reporting Bayesian results!
