1. Regression and forecasting
Let's dive into regression and predictions!
2. Linear regression
In a linear regression model, we model the response, y, as a linear combination of some predictors, x. We can model sales as a function of marketing spending, for instance. The intercept, beta-zero, is the sales level without any spending, and beta-one denotes the impact of marketing spending on sales.
In the frequentist world, the betas are fixed numbers, and we cannot find values for them that would make the equation hold exactly for every observation. Consequently, we add an error term, denoted by the Greek letter epsilon, and assume that this error term has a normal distribution with mean zero and some standard deviation, sigma.
In the Bayesian world, we treat the response as a random variable, and assume that it has a normal distribution with the mean defined by the regression equation and some standard deviation.
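The two formulations described above can be written side by side, using the same beta-zero, beta-one, and sigma as in the narration:

```latex
% Frequentist: fixed betas plus a normally distributed error term
y = \beta_0 + \beta_1 x + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \sigma)
% Bayesian: the response itself is a random variable whose mean
% is given by the regression equation
y \sim \mathcal{N}(\beta_0 + \beta_1 x, \; \sigma)
```

The two forms are equivalent in the mean and spread they imply for y; the Bayesian one simply makes the distributional assumption on the response explicit.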
Let's take a closer look at the normal distribution then!
3. Normal distribution
Let's sample draws from the normal distribution and plot them to see what it looks like. To do this, we use the np-dot-random-dot-normal function, which has two parameters to set: the mean and the standard deviation, which we set to 0 and 1, respectively.
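A minimal sketch of this sampling step (the sample size of 10,000 is an arbitrary choice):

```python
import numpy as np

# Draw 10,000 samples from the standard normal distribution:
# mean 0, standard deviation 1
draws = np.random.normal(0, 1, size=10000)

# The sample statistics should be close to the parameters we set
print(draws.mean())  # close to 0
print(draws.std())   # close to 1
```

Plotting a histogram of `draws` (for example with matplotlib's `plt.hist`) reveals the bell shape discussed next.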
The normal distribution density, also called a bell curve, is symmetric around its mean, and almost all of the probability mass (about 99.7%) lies within three standard deviations of the mean.
4. Normal distribution
By setting the mean to 3, the distribution shifts and centers around 3.
5. Normal distribution
By increasing the standard deviation to 3 instead, the distribution becomes wider and shorter.
6. Bayesian regression model definition
Let's now define our Bayesian regression model. We have already said that sales are normally distributed, but a full model specification also requires priors for all three parameters: the intercept, the spending impact, and the standard deviation.
We could use many different priors, but let's use normal ones for the betas. Assume sales and spending are in thousands of dollars. First, we expect $5000 in sales without any marketing. Also, we expect a $2000 increase in sales from each $1000 increase in spending, but we are not certain, so we make the prior for beta-one wider by setting its standard deviation to 10. We don't have any strong intuition about the regression standard deviation parameter, sigma, so we use a uniform prior.
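These priors can be sketched as draws, with all amounts in thousands of dollars. The standard deviation of the beta-zero prior and the bounds of the uniform prior for sigma are illustrative assumptions, not values from the narration:

```python
import numpy as np

n = 100_000
# Intercept prior: baseline sales around $5000 (sd=1 is an assumed choice)
beta_0_prior = np.random.normal(5, 1, size=n)
# Spending-impact prior: centered at 2, made wide with sd=10
beta_1_prior = np.random.normal(2, 10, size=n)
# Sigma prior: uniform, reflecting no strong intuition (bounds assumed)
sigma_prior = np.random.uniform(0, 10, size=n)
```

Simulating from the priors like this (a "prior predictive" check) is a common way to verify that the priors encode the beliefs you intended before fitting the model.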
7. Estimating regression parameters
How do we get the posteriors? Grid approximation could work, but it quickly becomes impractical as the number of parameters grows.
We could choose conjugate priors so that we can simulate from the posterior, but the conjugate priors for linear regression are not very intuitive. And we want our priors!
Fortunately, we can simulate from the posterior even with non-conjugate priors using Markov Chain Monte Carlo, a technique which will be covered in the next chapter. For now, let's assume we have sampled the parameter draws and focus on working with them.
8. Plot posterior
A good practice is to analyze the posterior draws visually before we make any predictions with the model. Let's introduce another function from pymc3 called plot_posterior. You pass it the draws and set the credible interval, and it plots the density, marking its mean and the interval.
9. Posterior draws analysis
With many parameters, it's convenient to look at all of them at once. Once you have all three parameters sampled, you can collect them in a DataFrame and use the dot-describe method to inspect the descriptive statistics of the posterior draws.
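A sketch of this step, again with simulated stand-ins for the posterior draws:

```python
import numpy as np
import pandas as pd

# Hypothetical posterior draws; in practice these come from MCMC sampling
draws = pd.DataFrame({
    "beta_0": np.random.normal(5, 0.5, size=4000),
    "beta_1": np.random.normal(2, 0.3, size=4000),
    "sigma": np.random.uniform(0.5, 1.5, size=4000),
})

# One row of summary statistics (mean, std, quantiles) per parameter
print(draws.describe())
```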
10. Predictive distribution
Time to make predictions. How much in sales can we expect if we spend $1000 on marketing? To calculate this, we first get the point estimates of the parameters, in this case the posterior means. Then, we calculate the mean of the sales distribution according to the regression formula, setting 1000 as the marketing spending. Finally, we simulate from the predictive distribution to get the sales forecast.
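These three steps can be sketched as follows. The posterior draws here are hypothetical placeholders (real ones come from MCMC), and spending is measured in thousands of dollars, so $1000 corresponds to x = 1:

```python
import numpy as np

# Hypothetical posterior draws for the three parameters
n = 4000
beta_0_draws = np.random.normal(5, 0.5, size=n)
beta_1_draws = np.random.normal(0.9, 0.3, size=n)
sigma_draws = np.random.uniform(0.5, 1.5, size=n)

# Step 1: point estimates = posterior means
b0 = beta_0_draws.mean()
b1 = beta_1_draws.mean()
s = sigma_draws.mean()

# Step 2: mean of the sales distribution from the regression formula,
# with spending x = 1 (i.e. $1000)
mu_pred = b0 + b1 * 1

# Step 3: simulate from the predictive distribution to get the forecast
forecast = np.random.normal(mu_pred, s, size=n)
print(forecast.mean())  # expected sales, in thousands of dollars
```

Plotting a histogram of `forecast` gives the predictive distribution shown on the next slide.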
11. Predictive distribution
And here is the result! With $1000 marketing spending, we can expect slightly less than $6000 in sales.
12. Let's regress and forecast!
Let's practice!