1. We can calculate!
So we can calculate new probabilities using the sum and the product rules. But in earlier chapters we didn't calculate probabilities directly; we simulated them.
2. Simulation vs calculation
If we wanted to figure out the probability that a generative model would generate a certain value, we simulated a large number of samples and then counted the proportion of samples taking on that value. This was easy to do for generative models that correspond to common probability distributions, like the binomial or the Poisson distribution, as you can use the "r"-functions, like rbinom and rpois, for this. For example, this is how we got p(n_visitors = 13 | prob_success = 10%), the probability that 13 out of 100 would visit your site if the underlying proportion of clicks was 10%.

But one thing that faster Bayesian computational methods have in common is that they require this probability to be calculated directly rather than simulated. Fortunately, for these common distributions someone has already figured out how to do this, and in R we can use the "d"-functions, like dbinom and dpois, for this. Using dbinom we can directly calculate the probability that 13 out of 100 would visit your site, like this. This is obviously much more efficient than having to generate 100,000 or so samples first.

We can also manipulate the resulting probabilities using the rules we just learned. For example,
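The code on the slides isn't reproduced in this transcript, but the two approaches just described can be sketched roughly as follows (the sample size of 100,000 is my own choice for illustration):

```r
# Simulation: approximate p(n_visitors = 13 | prob_success = 10%)
# by drawing many samples and counting the proportion that equal 13.
n_samples <- 100000
n_visitors <- rbinom(n_samples, size = 100, prob = 0.1)
mean(n_visitors == 13)  # roughly 0.074, but varies from run to run

# Calculation: get the same probability directly with dbinom.
dbinom(13, size = 100, prob = 0.1)  # about 0.0743
```

The "r" and "d" functions share the same distribution parameters; the only difference is that rbinom's first argument is how many samples to draw, while dbinom's first argument is the outcome whose probability you want.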
3. Simulation vs calculation
the probability of getting 13 or 14 visitors would be 12.6%, here calculated using the sum rule. Finally, if we want a whole probability distribution, we'll have to calculate the probability for a range of values. But the d-functions are generally vectorized, so that is easy! If we wanted the probability distribution over the number of visitors we would expect, conditional on the proportion of success being 10%, we could calculate it like this. Instead of the resulting probability distribution being represented by a large number of samples, as in earlier chapters, it's now represented by a vector giving the probability of each outcome. And we can plot it like this.
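A sketch of these calculations (the variable names and the plotting style are my own, not necessarily those on the slides):

```r
# Sum rule: 13 and 14 visitors are mutually exclusive outcomes,
# so their probabilities add up.
dbinom(13, size = 100, prob = 0.1) + dbinom(14, size = 100, prob = 0.1)
# about 0.126, that is, 12.6%

# dbinom is vectorized, so one call gives the whole distribution
# over the possible number of visitors (0 to 100).
n_visitors <- 0:100
probability <- dbinom(n_visitors, size = 100, prob = 0.1)

# The distribution is now a vector with one probability per outcome,
# and can be plotted directly, here with vertical lines per outcome.
plot(n_visitors, probability, type = "h")
```

Since these probabilities cover every possible outcome, the vector sums to 1, which is a quick sanity check on the calculation.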
4. Plotting a calculated distribution
The binomial is an example of a discrete distribution; it defines probability over whole numbers: 0, 1, 2, and so on. But there are also
5. Continuous distributions
continuous distributions, and earlier we used the uniform distribution like this. There is also a d-version of runif: dunif,
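The earlier use of the uniform distribution referred to here isn't shown in the transcript, but it might have looked something like this sketch (the sample size and the interval from 0 to 0.2 are assumptions based on the surrounding discussion):

```r
# Draw 100,000 samples uniformly between 0 and 0.2.
# Every value in that interval is equally likely to be drawn.
prop_success <- runif(n = 100000, min = 0, max = 0.2)
head(prop_success)
```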
6. Continuous distributions
and you might expect that this would return the probability of the outcome being 0.12, assuming a uniform distribution between 0 and 0.2. But it does not. For starters, a probability can be at most 1, and here we got a 5 back. That's because for continuous distributions you can't really talk about the probability of a single value: the probability of getting exactly 0.12 is effectively zero. So instead you get the probability density, a number that on its own doesn't tell you much, but that you can view as a relative probability: a value with twice the probability density of another value is twice as probable. This is also where the d in dunif and dbinom comes from; it stands for density. Since we're using a uniform distribution here, we should not be surprised that any value between 0 and 0.2 returns the same number: 5. Within a uniform probability distribution, any value is equally likely.
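A sketch of the dunif behavior described above (the specific values 0.12 and 0.19 are just examples):

```r
# dunif returns a probability *density*, not a probability.
# For a uniform distribution between 0 and 0.2 the density is
# 1 / (0.2 - 0) = 5 everywhere inside the interval, and 0 outside it.
dunif(0.12, min = 0, max = 0.2)  # 5
dunif(0.19, min = 0, max = 0.2)  # also 5: every value is equally likely
dunif(0.42, min = 0, max = 0.2)  # 0: outside the interval
```

The density of 5 integrates to 1 over the interval of width 0.2, which is why a density, unlike a probability, can exceed 1.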
7. Try this out!
Now, try out what I’ve shown you here in some exercises.