
A Bayesian model of water temperature

1. A Bayesian model of water temperature

OK, let’s use the normal distribution to build a Bayesian model for the temperature data, to figure out whether I should have my beach party or not. Before coding this up, let’s write down the model using the tilde notation we looked at in the last chapter. So we had a number of

2. Let's define the model

water temperatures from five 20th of Julys, and we’re going to model them as coming from

3. Let's define the model

a normal distribution with some unknown mean mu and unknown standard deviation sigma. Finally, we need to specify what the model knows about these two parameters before (prior to) being informed by the data; that is, we need to define prior probability distributions. I would say that

4. Let's define the model

the standard deviation should be somewhere between 0 and 10, given what I know about Swedish water temperatures. The typical summer water temperature would be around 18 degrees Celsius, plus or minus 5. To represent this information

5. Let's define the model

we could use another normal distribution centered on 18, with a standard deviation of 5, as the prior for mu. We could have used many other priors, but let’s go with this for now! Now we need to fit this model, and as in the last chapter we’re going to use grid approximation. Let’s pull up the code from the last chapter.
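Written out in tilde notation, the model described above looks roughly like this. The symbol temp_i for the i-th temperature measurement is a naming assumption; the distributions and their parameters are the ones stated in the text.

```latex
\begin{aligned}
\text{temp}_i &\sim \text{Normal}(\mu, \sigma), \quad i = 1, \dots, 5 \\
\mu &\sim \text{Normal}(18, 5) \\
\sigma &\sim \text{Uniform}(0, 10)
\end{aligned}
```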

6. Let's fit the model

Let’s change this to fit the new normal model. We’re no longer dealing with ads but with temperatures, so let’s change that.

7. Let's fit the model

The unknown parameter is no longer the proportion of clicks; now it’s

8. Let's fit the model

the mean mu and the standard deviation sigma of the normal distribution, and we want to define a grid over these two parameters that is wide enough and fine-grained enough.

9. Let's fit the model

Here I’m going with a grid over mu from 8 to 30 degrees in steps of half a degree, and for sigma I go from 0.1 to 10 in steps of 0.3. Then we’ll use these two vectors with the expand.grid function to create a data frame that lists all possible combinations of parameters.
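A minimal sketch of that grid in R, using the ranges and step sizes given above. The vector names mu_grid and sigma_grid and the column names mu and sigma are assumptions for illustration; pars is the data frame name used later in the text.

```r
# Candidate values for the mean: 8 to 30 degrees in steps of 0.5
mu_grid <- seq(8, 30, by = 0.5)

# Candidate values for the standard deviation: 0.1 to 10 in steps of 0.3
sigma_grid <- seq(0.1, 10, by = 0.3)

# One row per (mu, sigma) combination
pars <- expand.grid(mu = mu_grid, sigma = sigma_grid)

head(pars)
```

Calling plot(pars$mu, pars$sigma) on the result draws one dot per parameter combination, which is one way to picture the grid.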

10. Let's fit the model

Now we have defined a grid over, hopefully, the relevant parts of parameter space. What’s parameter space? Well, it’s the space of all possible parameter combinations.

11. The parameter space

Here is what we have right now: a two-dimensional parameter space, where each dot is one of the parameter combinations in pars. With that in place, we now need to define the right priors and likelihoods.

12. Let's fit the model

The prior over mu

13. Let's fit the model

was a normal distribution with mean 18 and standard deviation 5,

14. Let's fit the model

the prior over sigma was uniform from zero to ten. And finally, we can just

15. Let's fit the model

multiply these together to get a combined prior. The likelihood of the data given the parameters needs to be calculated for each parameter combination. So,

16. Let's fit the model

for each row i in pars we’ll calculate

17. Let's fit the model

the likelihoods of the data points assuming a normal distribution, and

18. Let's fit the model

then we’ll multiply these likelihoods together to get the combined likelihood of the five data points. The last part would be to calculate the posterior probability of each parameter combination, but fortunately, Bayes’ theorem stays the same, so we don’t have to change a thing! Now let’s run this, and let’s take a look at the probability distribution we get after (posterior to) having used the data.
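The steps above can be sketched end to end in R as follows. This is an illustrative reconstruction under stated assumptions, not necessarily the exact code shown on the slides: the five values in temp are hypothetical placeholders (the transcript does not list the measurements), and the column names are chosen here for readability.

```r
# Hypothetical placeholder data: water temperatures (degrees Celsius)
# from five 20th of Julys. Replace with the real measurements.
temp <- c(19, 23, 20, 17, 23)

# Grid over the two unknown parameters
pars <- expand.grid(mu = seq(8, 30, by = 0.5),
                    sigma = seq(0.1, 10, by = 0.3))

# Priors: mu ~ Normal(18, 5), sigma ~ Uniform(0, 10)
pars$mu_prior <- dnorm(pars$mu, mean = 18, sd = 5)
pars$sigma_prior <- dunif(pars$sigma, min = 0, max = 10)
pars$prior <- pars$mu_prior * pars$sigma_prior

# Likelihood of all data points for each parameter combination
pars$likelihood <- NA_real_
for (i in seq_len(nrow(pars))) {
  likelihoods <- dnorm(temp, mean = pars$mu[i], sd = pars$sigma[i])
  pars$likelihood[i] <- prod(likelihoods)
}

# Bayes' theorem on the grid: prior times likelihood, then normalize
pars$probability <- pars$likelihood * pars$prior
pars$probability <- pars$probability / sum(pars$probability)

# The most probable parameter combination on the grid
pars[which.max(pars$probability), c("mu", "sigma", "probability")]
```

Normalizing by the sum makes the grid probabilities add up to one. One way to draw the result as a grid of shaded squares is levelplot(probability ~ mu * sigma, data = pars) from the lattice package; whether the plot in the video was made this way is an assumption.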

19. The result

Here it is! Each square in this plot shows the probability of a parameter combination, with more probable parameter combinations having darker colors. Just eyeballing this, it looks like the most probable mean temperature should be around 18 to 22 degrees. We’ll take a look at that in more detail soon, but first,

20. Replicate this analysis using zombie data!

take a stab at replicating this analysis in the following exercises, but using some zombie data instead.