1. Bayesian calculation
In chapter two we simulated a large number of samples to end up with
2. Sampled joint distribution
a joint distribution over the uncertain parameter (the underlying proportion of clicks) and the future data (how many clicks we would get out of a hundred shown ads). Then we conditioned on the data we got,
3. Conditioned sampled joint distribution
thirteen visitors, and that allowed the model to become more certain about the underlying proportion of clicks. This conditioning on the data is what Bayesian inference is. Now, you’ve learned the basic probability rules. You’ve learned how to calculate probabilities and densities using probability distributions. We’re going to put this together to do Bayesian inference.
4. Bayesian inference by calculation
But instead of simulating, as in chapter two, we’ll calculate everything directly in R, using the same example as before. First, let’s state what we know. As before, we know that we’re going to show 100 ads.
5. Bayesian inference by calculation
What we don’t know is
6. Bayesian inference by calculation
the underlying proportion of clicks, and the number of clicks, and resulting visitors, our 100 shown ads will generate. We’re now going to list all possible combinations of these two unknowns. This is easy for n_visitors,
7. Bayesian inference by calculation
we get at least zero and at most a hundred. The underlying proportion of clicks is continuous from zero to one, so we can’t do all combinations, but at least we can do
8. Bayesian inference by calculation
a fine grid of values. We’ll then use the
9. Bayesian inference by calculation
expand.grid() function, which will generate all possible combinations of n_visitors and proportion_clicks and put them neatly into a data frame called pars, as in parameters. Shown in the table here is just a sample from pars, but pars really lists every combination of proportion_clicks and n_visitors. Now we’ll add the rest of the model to this data frame. First, the prior for proportion_clicks.
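As a sketch of this step in R (the exact grid step for proportion_clicks isn’t stated in the narration; 0.01 is an assumption):

```r
# List all possible combinations of the two unknowns.
n_visitors <- seq(0, 100, by = 1)          # 0 to 100 visitors are possible
proportion_clicks <- seq(0, 1, by = 0.01)  # a fine grid over the continuous parameter
pars <- expand.grid(proportion_clicks = proportion_clicks,
                    n_visitors = n_visitors)
head(pars)  # one row per combination: 101 * 101 = 10201 rows in total
```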
10. Bayesian inference by calculation
This is what we did when we sampled in chapter 2. Now instead we’ll calculate it for each proportion_clicks value in pars:
11. Bayesian inference by calculation
As you can see in the table, it becomes the same density, 5, for each value up to 0.2, and is otherwise zero. Then we had the probability distribution over future data, n_visitors, given proportion_clicks.
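A minimal sketch of calculating this prior, rebuilding the pars grid so the snippet stands on its own (the 0.01 grid step is an assumption):

```r
n_visitors <- seq(0, 100, by = 1)
proportion_clicks <- seq(0, 1, by = 0.01)
pars <- expand.grid(proportion_clicks = proportion_clicks,
                    n_visitors = n_visitors)
# Uniform prior over [0, 0.2]: density 1 / 0.2 = 5 inside the interval, 0 outside
pars$prior <- dunif(pars$proportion_clicks, min = 0, max = 0.2)
```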
12. Bayesian inference by calculation
Which looked like this when we sampled. Now, we’ll also calculate this for each row in pars:
13. Bayesian inference by calculation
I’m adding it to a new column called “likelihood”, as this quantity is often simply called the likelihood of the data. Finally, we’re ready to calculate the probability of each proportion_clicks and n_visitors combination, which combines the likelihood of the data given the parameter value with how likely that parameter value was to begin with. To calculate the probability of this and that, we can use multiplication, so the probability is
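The likelihood column can be computed with dbinom, one binomial probability per row; the snippet rebuilds the grid so it runs on its own (grid step 0.01 assumed):

```r
n_visitors <- seq(0, 100, by = 1)
proportion_clicks <- seq(0, 1, by = 0.01)
pars <- expand.grid(proportion_clicks = proportion_clicks,
                    n_visitors = n_visitors)
pars$prior <- dunif(pars$proportion_clicks, min = 0, max = 0.2)
# For each row: the probability of getting that row's number of visitors
# out of 100 shown ads, given that row's underlying proportion of clicks
pars$likelihood <- dbinom(pars$n_visitors,
                          size = 100, prob = pars$proportion_clicks)
```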
14. Bayesian inference by calculation
the likelihood times the prior. Almost. Because the total probability is now
15. Bayesian inference by calculation
105. But a probability should be a number between 0 and 1,
16. Bayesian inference by calculation
so we’ll have to divide what we just calculated by the total sum, to normalize it into a probability distribution.
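Putting the steps together, a self-contained sketch of the calculation up to and including the normalization (grid step 0.01 assumed):

```r
n_visitors <- seq(0, 100, by = 1)
proportion_clicks <- seq(0, 1, by = 0.01)
pars <- expand.grid(proportion_clicks = proportion_clicks,
                    n_visitors = n_visitors)
pars$prior <- dunif(pars$proportion_clicks, min = 0, max = 0.2)
pars$likelihood <- dbinom(pars$n_visitors,
                          size = 100, prob = pars$proportion_clicks)
# Likelihood times prior...
pars$probability <- pars$likelihood * pars$prior
sum(pars$probability)  # ~105: the 21 grid values with prior density 5 each contribute 5
# ...then divide by the total sum to normalize into a probability distribution
pars$probability <- pars$probability / sum(pars$probability)
```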
17. Bayesian inference by calculation
There, that fixed it. In chapter two, after having done all that sampling, we ended up with
18. Sampled joint plot
this probability distribution. Now, after all this calculation, if we plot the probability of each proportion_clicks and n_visitors combination, we get this:
19. Calculated joint plot
Pretty similar, right? Finally, let’s bring in the actual data.
20. Bayesian inference by calculation
As there were 13 clicks out of a hundred shown ads, we can now
21. Bayesian inference by calculation
remove all rows where n_visitors is anything else. And while we’re at it, we’ll normalize again, to make sure the remaining probabilities sum to one. Here’s the result:
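The conditioning step can be sketched as a filter followed by a renormalization; the snippet rebuilds the whole pipeline so it runs on its own (grid step 0.01 assumed):

```r
n_visitors <- seq(0, 100, by = 1)
proportion_clicks <- seq(0, 1, by = 0.01)
pars <- expand.grid(proportion_clicks = proportion_clicks,
                    n_visitors = n_visitors)
pars$prior <- dunif(pars$proportion_clicks, min = 0, max = 0.2)
pars$likelihood <- dbinom(pars$n_visitors,
                          size = 100, prob = pars$proportion_clicks)
pars$probability <- pars$likelihood * pars$prior
pars$probability <- pars$probability / sum(pars$probability)
# Condition on the actual data: keep only the rows where n_visitors is 13...
pars <- pars[pars$n_visitors == 13, ]
# ...and normalize again so the remaining probabilities sum to one
pars$probability <- pars$probability / sum(pars$probability)
```

What remains is the posterior distribution over proportion_clicks, which should peak near 0.13, matching the simulation result from chapter two.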
22. Calculated marginal joint distributions
If we now zoom in on this slice and plot from the side, instead of looking at it from above,
23. Marginal distribution: simulated vs. sampled 1
we see that we have retrieved the same probability distribution over the likely underlying proportions of clicks as we got when we simulated in chapter two.
24. Marginal distribution: simulated vs. sampled 2
But running the simulation is much slower on my computer than calculating it directly! That was a lot of calculation,
25. Calculate for yourself!
but it’s not over because now you get to try it out yourself in a couple of exercises!