1. Regression and forecasting
Let's dive into regression and predictions!
2. Linear regression
In a linear regression model, we model the response, y, as a linear combination of some predictors, x. We can model sales as a function of marketing spending, for instance. The intercept, beta-zero, is the sales level without any spending, and beta-one denotes the impact of marketing spending on sales.
In the frequentist world, the betas are fixed numbers, and we cannot find values for them that would make the equation hold exactly for every observation. Consequently, we add an error term, denoted by the Greek letter epsilon, and assume that this error term has a normal distribution with mean zero and some standard deviation, sigma.
In the Bayesian world, we treat the response as a random variable, and assume that it has a normal distribution with the mean defined by the regression equation and some standard deviation.
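The two formulations described above can be written side by side, using the same beta-zero, beta-one, and sigma as in the narration:

```latex
% Frequentist: fixed betas plus a normally distributed error term
y = \beta_0 + \beta_1 x + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \sigma)
% Bayesian: the response itself is a random variable whose mean
% is given by the regression equation
y \sim \mathcal{N}(\beta_0 + \beta_1 x, \; \sigma)
```

The two forms are equivalent in the mean and spread they imply for y; the Bayesian one simply makes the distributional assumption on the response explicit.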
Let's take a closer look at the normal distribution then!
3. Normal distribution
Let's sample draws from the normal distribution and plot them to see what it looks like. To do this, we use the np-dot-random-dot-normal function, which has two parameters to set: the mean and the standard deviation, which we set to 0 and 1, respectively.
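A minimal sketch of this sampling step (the sample size of 10,000 is an arbitrary choice):

```python
import numpy as np

# Draw 10,000 samples from the standard normal distribution:
# mean 0, standard deviation 1
draws = np.random.normal(0, 1, size=10000)

# The sample statistics should be close to the parameters we set
print(draws.mean())  # close to 0
print(draws.std())   # close to 1
```

Plotting a histogram of `draws` (for example with matplotlib's `plt.hist`) reveals the bell shape discussed next.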
The normal distribution density, also called a bell curve, is symmetric around its mean, and almost all of the probability mass (about 99.7%) lies within three standard deviations of the mean.
4. Normal distribution
By setting the mean to 3, the distribution shifts and centers around 3.
5. Normal distribution
By increasing the standard deviation to 3 instead, the distribution becomes wider and shorter.
6. Bayesian regression model definition
Let's now define our Bayesian regression model. We have already said that sales are normally distributed, but a full model specification also requires priors for all three parameters: the intercept, the spending impact, and the standard deviation.
We could use many different priors, but let's use normal ones for the betas. Assume sales and spending are in thousands of dollars. First, we expect $5000 in sales without any marketing. Also, we expect a $2000 increase in sales from each $1000 increase in spending, but we are not certain, so we make the prior for beta-one wider by setting its standard deviation to 10. We don't have any strong intuition about the regression standard deviation parameter, sigma, so we use a uniform prior.
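These priors can be sketched as draws, with all amounts in thousands of dollars. The standard deviation of the beta-zero prior and the bounds of the uniform prior for sigma are illustrative assumptions, not values from the narration:

```python
import numpy as np

n = 100_000
# Intercept prior: baseline sales around $5000 (sd=1 is an assumed choice)
beta_0_prior = np.random.normal(5, 1, size=n)
# Spending-impact prior: centered at 2, made wide with sd=10
beta_1_prior = np.random.normal(2, 10, size=n)
# Sigma prior: uniform, reflecting no strong intuition (bounds assumed)
sigma_prior = np.random.uniform(0, 10, size=n)
```

Simulating from the priors like this (a "prior predictive" check) is a common way to verify that the priors encode the beliefs you intended before fitting the model.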
7. Estimating regression parameters
How do we get the posteriors? Grid approximation could work, but it quickly becomes impractical as the number of parameters grows.
We could choose conjugate priors so that we can simulate from the posterior, but the conjugate priors for linear regression are not very intuitive. And we want our priors!
Fortunately, we can simulate from the posterior even with non-conjugate priors using Markov Chain Monte Carlo, a technique which will be covered in the next chapter. For now, let's assume we have sampled the parameter draws and focus on working with them.
8. Plot posterior
A good practice is to analyze the posterior draws visually before we make any predictions with the model. Let's introduce another function from pymc3 called plot_posterior. You pass it the draws and set the credible interval, and it plots the density, marking its mean and the interval.
9. Posterior draws analysis
With many parameters, it's convenient to look at all of them at once. Once you have all three parameters sampled, you can collect them in a DataFrame and use the dot-describe method to inspect the descriptive statistics of the posterior draws.
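A sketch of this step, again with simulated stand-ins for the posterior draws:

```python
import numpy as np
import pandas as pd

# Hypothetical posterior draws; in practice these come from MCMC sampling
draws = pd.DataFrame({
    "beta_0": np.random.normal(5, 0.5, size=4000),
    "beta_1": np.random.normal(2, 0.3, size=4000),
    "sigma": np.random.uniform(0.5, 1.5, size=4000),
})

# One row of summary statistics (mean, std, quantiles) per parameter
print(draws.describe())
```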
10. Predictive distribution
Time to make predictions. How much in sales can we expect if we spend $1000 on marketing? To calculate this, we first get the point estimates of the parameters, in this case the posterior means. Then, we calculate the mean of the sales distribution according to the regression formula, setting 1000 as the marketing spending. Finally, we simulate from the predictive distribution to get the sales forecast.
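These three steps can be sketched as follows. The posterior draws here are hypothetical placeholders (real ones come from MCMC), and spending is measured in thousands of dollars, so $1000 corresponds to x = 1:

```python
import numpy as np

# Hypothetical posterior draws for the three parameters
n = 4000
beta_0_draws = np.random.normal(5, 0.5, size=n)
beta_1_draws = np.random.normal(0.9, 0.3, size=n)
sigma_draws = np.random.uniform(0.5, 1.5, size=n)

# Step 1: point estimates = posterior means
b0 = beta_0_draws.mean()
b1 = beta_1_draws.mean()
s = sigma_draws.mean()

# Step 2: mean of the sales distribution from the regression formula,
# with spending x = 1 (i.e. $1000)
mu_pred = b0 + b1 * 1

# Step 3: simulate from the predictive distribution to get the forecast
forecast = np.random.normal(mu_pred, s, size=n)
print(forecast.mean())  # expected sales, in thousands of dollars
```

Plotting a histogram of `forecast` gives the predictive distribution shown on the next slide.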
11. Predictive distribution
And here is the result! With $1000 marketing spending, we can expect slightly less than $6000 in sales.
12. Let's regress and forecast!
Let's practice!