
Bayesian calculation

1. Bayesian calculation

In chapter two we simulated a large number of samples to end up with

2. Sampled joint distribution

a joint distribution over the uncertain parameter, the underlying proportion of clicks, and the future data, how many clicks we would get out of a hundred shown ads. Then we conditioned on the data we got,

3. Conditioned sampled joint distribution

thirteen visitors, and that allowed the model to become more certain about the underlying proportion of clicks. This, to condition on the data, is what Bayesian inference is. Now, you’ve learned the basic probability rules. You’ve learned how to calculate probabilities and densities using probability distributions. We’re going to put this together to do Bayesian inference.

4. Bayesian inference by calculation

But instead of simulating, as in chapter two, we’ll calculate everything directly in R, using the same example as in chapter 2. First, let’s state what we know. As before, we know that we’re going to show 100 ads.

5. Bayesian inference by calculation

What we don’t know is

6. Bayesian inference by calculation

the underlying proportion of clicks, and the number of clicks and visitors our 100 shown ads will generate. We’re now going to list all possible combinations of these two unknowns. This is easy for n_visitors,

7. Bayesian inference by calculation

which can be at least zero and at most a hundred. The underlying proportion of clicks is continuous from zero to one, so we can’t list all combinations, but at least we can do

8. Bayesian inference by calculation

a fine grid of values. We’ll then use the

9. Bayesian inference by calculation

expand.grid function, which will generate all possible combinations of n_visitors and proportion_clicks and put them neatly into a data frame called pars, as in parameters. Shown in the table here is just a sample from pars, but pars really lists every combination of proportion_clicks and n_visitors. Now we’ll add the rest of the model to this data frame. First, the prior for proportion_clicks.
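As a sketch of this grid setup in R (the exact grid step for proportion_clicks is an assumption here; 0.01 is consistent with the totals mentioned later in this video):

```r
# The two unknowns: every possible click count out of 100 shown ads,
# and a fine grid over the underlying proportion of clicks.
n_visitors <- seq(0, 100, by = 1)
proportion_clicks <- seq(0, 1, by = 0.01)  # grid step is an assumption

# expand.grid generates every combination of the two unknowns
# and returns them as a data frame
pars <- expand.grid(proportion_clicks = proportion_clicks,
                    n_visitors = n_visitors)
```

With these settings, pars has 101 × 101 = 10,201 rows, one per combination.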

10. Bayesian inference by calculation

This is what we did when we sampled in chapter 2. Now instead we’ll calculate it for each proportion_clicks value in pars:

11. Bayesian inference by calculation

As you can see in the table, the prior becomes the same density, 5, for each value of proportion_clicks up to 0.2, and is zero otherwise. Then we had the probability distribution over future data, n_visitors, given proportion_clicks.
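In code, the prior column could be added like this, assuming the same uniform prior between 0 and 0.2 as in chapter 2 (the grid setup is repeated so the snippet stands on its own):

```r
# Rebuild the grid of parameter combinations
n_visitors <- seq(0, 100, by = 1)
proportion_clicks <- seq(0, 1, by = 0.01)  # grid step is an assumption
pars <- expand.grid(proportion_clicks = proportion_clicks,
                    n_visitors = n_visitors)

# Uniform prior on [0, 0.2]: density 1 / 0.2 = 5 inside the
# interval, 0 outside it
pars$prior <- dunif(pars$proportion_clicks, min = 0, max = 0.2)
```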

12. Bayesian inference by calculation

Which looked like this when we sampled. Now, we’ll also calculate this for each row in pars:
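Calculated directly, this uses dbinom over every row of pars, since each n_visitors outcome follows a binomial distribution with 100 trials (the grid from the earlier step is rebuilt, with the same assumed step size):

```r
# Rebuild the grid of parameter combinations
n_visitors <- seq(0, 100, by = 1)
proportion_clicks <- seq(0, 1, by = 0.01)  # grid step is an assumption
pars <- expand.grid(proportion_clicks = proportion_clicks,
                    n_visitors = n_visitors)

# Probability of each n_visitors outcome given each proportion_clicks:
# a binomial with 100 trials and success probability proportion_clicks
pars$likelihood <- dbinom(pars$n_visitors,
                          size = 100, prob = pars$proportion_clicks)
```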

13. Bayesian inference by calculation

I’m adding it to a new column called “likelihood”, as this quantity is often just called the likelihood of the data. Finally, we’re ready to calculate the probability of each proportion_clicks and n_visitors combination, which combines the likelihood of the data given the parameter value with how likely that parameter value was to begin with. To calculate the probability of this and that, we can use multiplication, so the probability is

14. Bayesian inference by calculation

the likelihood times the prior. Almost. Because the total probability is now

15. Bayesian inference by calculation

105. But probability should be a number between 0 and 1,

16. Bayesian inference by calculation

so we’ll have to divide what we just calculated by the total sum, to normalize it into a probability distribution.
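Putting these two steps together, again assuming the grid and prior from the earlier steps, might look like this:

```r
# Rebuild the grid, prior, and likelihood from the earlier steps
n_visitors <- seq(0, 100, by = 1)
proportion_clicks <- seq(0, 1, by = 0.01)  # grid step is an assumption
pars <- expand.grid(proportion_clicks = proportion_clicks,
                    n_visitors = n_visitors)
pars$prior <- dunif(pars$proportion_clicks, min = 0, max = 0.2)
pars$likelihood <- dbinom(pars$n_visitors,
                          size = 100, prob = pars$proportion_clicks)

# Likelihood times prior gives an unnormalized probability ...
pars$probability <- pars$likelihood * pars$prior
total <- sum(pars$probability)

# ... so divide by the total sum to normalize it into a
# probability distribution
pars$probability <- pars$probability / total
```

With this grid the unnormalized total comes out to 105, the figure mentioned above: the prior density 5 summed over the 21 grid points at or below 0.2.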

17. Bayesian inference by calculation

There, that fixed it. In chapter two, after having done all that sampling, we ended up with

18. Sampled joint plot

this probability distribution. Now, after all this calculation, if we plot the probability of each proportion_clicks and n_visitors combination, we get this:

19. Calculated joint plot

Pretty similar, right? Finally, let’s bring in the actual data.

20. Bayesian inference by calculation

As there were 13 clicks out of a hundred shown ads, we can now

21. Bayesian inference by calculation

remove all rows where n_visitors is something else. And while we’re at it, we’ll normalize again, to make sure the remaining probabilities sum to one. Here’s the result:

22. Calculated marginal joint distributions

If we now zoom in on this slice and plot from the side, instead of looking at it from above,

23. Marginal distribution: simulated vs. sampled 1

we see that we have retrieved the same probability distribution over the likely underlying proportions of clicks as we got when we simulated in chapter two.
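The whole pipeline, including the final conditioning step, can be sketched like this (same assumed grid as in the earlier snippets):

```r
# Rebuild the joint distribution from the earlier steps
n_visitors <- seq(0, 100, by = 1)
proportion_clicks <- seq(0, 1, by = 0.01)  # grid step is an assumption
pars <- expand.grid(proportion_clicks = proportion_clicks,
                    n_visitors = n_visitors)
pars$prior <- dunif(pars$proportion_clicks, min = 0, max = 0.2)
pars$likelihood <- dbinom(pars$n_visitors,
                          size = 100, prob = pars$proportion_clicks)
pars$probability <- pars$likelihood * pars$prior
pars$probability <- pars$probability / sum(pars$probability)

# Condition on the observed data: keep only the rows where
# n_visitors matches the 13 clicks we actually saw
pars <- pars[pars$n_visitors == 13, ]

# Renormalize so the remaining probabilities sum to one again
pars$probability <- pars$probability / sum(pars$probability)
```

What remains is the distribution over proportion_clicks given the data: the same posterior the simulation in chapter two arrived at, with its peak near the observed proportion of 0.13.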

24. Marginal distribution: simulated vs. sampled 2

But running the simulation is much slower on my computer than calculating it directly! That was a lot of calculation,

25. Calculate for yourself!

but it’s not over because now you get to try it out yourself in a couple of exercises!