Interpreting confidence and prediction intervals

1. Interpreting confidence and prediction intervals

Let's look at confidence and prediction intervals.

2. Why do we need intervals?

Making decisions based on data often involves uncertainty. Whether estimating sales, forecasting trends, or predicting individual outcomes, it is essential to account for variability in the data. Confidence intervals and prediction intervals help quantify and communicate this uncertainty by showing a range of plausible values instead of a single point estimate.

3. Confidence intervals

When we use a sample to estimate a population parameter such as the mean, a confidence interval measures the reliability of that estimate. A confidence interval provides a range of values that is likely to contain the true population parameter, based on the sample data. Conventionally, a 95% confidence interval is used. This means that if we were to repeatedly take samples and calculate an interval from each one, about 95% of those intervals would contain the true mean.
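The repeated-sampling interpretation can be checked with a small simulation. The sketch below is illustrative only: the population (mean 50, standard deviation 10), the sample size, and the function names `normal_ci` and `coverage` are assumptions, not part of the course material. It uses a normal quantile as a large-sample approximation; for small samples a t quantile would be the usual choice.

```python
import random
import statistics


def normal_ci(sample, confidence=0.95):
    """Approximate CI for the mean using a normal quantile (large-sample assumption)."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sem = statistics.stdev(sample) / n ** 0.5  # standard error of the mean
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * sem, mean + z * sem


def coverage(true_mean=50.0, true_sd=10.0, n=100, repeats=2000, seed=0):
    """Fraction of simulated samples whose interval contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        # Draw a fresh sample from the (hypothetical) population each time.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        low, high = normal_ci(sample)
        hits += low <= true_mean <= high
    return hits / repeats


print(f"Share of intervals containing the true mean: {coverage():.3f}")
```

The printed share should land close to 0.95. Each individual interval either contains the true mean or does not; the 95% describes the procedure across many samples.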

4. Confidence interval example

Say you are surveying a sample of customers, and find an average spending of $50. To account for uncertainty, you calculate a 95% confidence interval of $47 to $53. If you were to repeat this survey many times and calculate an interval each time, about 95% of those intervals would contain the true average spending. The interval from this particular sample, $47 to $53, is your range of plausible values for the true average. In this way, the confidence interval helps you to account for sample variability.
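As a rough sketch of where such an interval could come from, the snippet below computes a 95% interval from summary statistics. The sample standard deviation ($15.30) and sample size (100) are hypothetical values chosen so the result matches the example; the transcript does not state them. The function name `mean_ci_from_summary` is likewise an assumption, and the normal quantile is a large-sample approximation.

```python
from statistics import NormalDist


def mean_ci_from_summary(mean, sd, n, confidence=0.95):
    """Approximate CI for a mean from summary statistics (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sd / n ** 0.5  # quantile times the standard error
    return mean - margin, mean + margin


# Hypothetical survey summary: mean $50, standard deviation $15.30, 100 customers.
low, high = mean_ci_from_summary(mean=50.0, sd=15.3, n=100)
print(f"95% CI: ${low:.2f} to ${high:.2f}")  # prints "95% CI: $47.00 to $53.00"
```

Surveying more customers or observing less spread in spending would both shrink the margin, since it scales with the standard deviation divided by the square root of the sample size.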

5. Prediction intervals

A prediction interval estimates the range within which an individual future data point is likely to fall. It is derived from a predictive model, and accounts for both the uncertainty in the model's estimate and the natural variability of individual observations around that estimate. Since individual data points are subject to more variability than an estimated parameter, prediction intervals are typically wider than confidence intervals.

6. Prediction interval example

Consider a company that uses regression analysis to predict customer spending based on age. If the model predicts that a 30-year-old customer will spend $50, the prediction interval might range from $40 to $60, reflecting the higher uncertainty associated with predicting an individual's spending behavior rather than the population's average.
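The example above can be sketched with a simple least-squares line fitted by hand. Everything here is an assumption for illustration: the synthetic data (spending = 35 + 0.5 × age plus noise with standard deviation 5), the helper names `fit_line` and `intervals_at`, and the use of a normal quantile in place of the t quantile with n − 2 degrees of freedom, which is reasonable only because the sample is large.

```python
import random
from statistics import NormalDist, fmean


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def intervals_at(xs, ys, x0, confidence=0.95):
    """Prediction at x0 with a CI for the mean response and a PI for one new point."""
    n = len(xs)
    slope, intercept = fit_line(xs, ys)
    x_bar = fmean(xs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = (rss / (n - 2)) ** 0.5  # residual standard error
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # large-sample approximation
    y_hat = intercept + slope * x0
    leverage = 1 / n + (x0 - x_bar) ** 2 / sxx
    ci_margin = z * s * leverage ** 0.5        # uncertainty in the fitted mean only
    pi_margin = z * s * (1 + leverage) ** 0.5  # adds individual-level variation
    ci = (y_hat - ci_margin, y_hat + ci_margin)
    pi = (y_hat - pi_margin, y_hat + pi_margin)
    return y_hat, ci, pi


# Synthetic, hypothetical customers (not real data).
rng = random.Random(42)
ages = [rng.uniform(18, 70) for _ in range(200)]
spending = [35 + 0.5 * age + rng.gauss(0, 5) for age in ages]

y_hat, ci, pi = intervals_at(ages, spending, x0=30)
print(f"Predicted spending at age 30: ${y_hat:.2f}")
print(f"95% CI for average spending at age 30: ${ci[0]:.2f} to ${ci[1]:.2f}")
print(f"95% PI for one customer aged 30: ${pi[0]:.2f} to ${pi[1]:.2f}")
```

The extra `1 +` inside the prediction-interval margin is what makes it wider: it represents the scatter of individual customers around the fitted line, which does not shrink as more data is collected.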

7. Confidence vs. prediction intervals

While both confidence and prediction intervals help interpret uncertainty in data, they serve different purposes. Confidence intervals estimate the likely range of a statistical parameter, such as a mean. Prediction intervals estimate the likely range of an individual observation. While confidence intervals are based on the sampling process used to estimate the statistical parameter, prediction intervals are based on a predictive model. Prediction intervals are typically wider because they account for both the uncertainty in estimating the population mean and the natural variation among individual observations.
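For the simplest case of estimating a mean, the difference in width can be written out. The formulas below are the standard textbook forms, assuming independent, roughly normally distributed observations; the symbols (sample mean $\bar{x}$, sample standard deviation $s$, sample size $n$, and the t quantile $t_{n-1,\,0.975}$ for a 95% level) are notation introduced here rather than in the course.

```latex
\text{CI for the mean: } \quad \bar{x} \pm t_{n-1,\,0.975}\,\frac{s}{\sqrt{n}}
\qquad\qquad
\text{PI for one new observation: } \quad \bar{x} \pm t_{n-1,\,0.975}\, s\,\sqrt{1 + \frac{1}{n}}
```

As $n$ grows, the confidence interval's margin shrinks toward zero, while the prediction interval's margin approaches $t \cdot s$, because the variation among individual observations remains no matter how precisely the mean is estimated.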

8. Example: risk management

Suppose an analyst is forecasting the return on an investment portfolio over the next year. Historical data suggests an average annual return of 8%.

9. Example: risk management

Using statistical software, the analyst calculates a 95% confidence interval for the portfolio's average return as 7% to 9%. This means that if the analysis were repeated on many samples of historical data, about 95% of the intervals calculated this way would contain the true average return. If the analyst wants to predict the return for a single investment in the portfolio, the prediction interval might be much wider, such as 4% to 12%. This accounts for the higher variability in individual investments compared to the overall portfolio's average performance. The confidence interval helps an investor understand the expected long-term performance of an asset class, whereas a prediction interval provides a realistic range for what might happen to a specific investment in a single year.

10. Let's practice!

Over to you!