Normal probabilities

1. Normal probabilities

You're familiar with the fundamentals of normal distributions; now we're going to calculate probabilities. Let's do it!

2. Probability density

Before we start, we have to import the norm object from the scipy.stats module. This has to be done once in any script or session that uses norm; in the rest of the lesson we will assume it is already imported. To calculate the probability density at a given value we use the probability density function, pdf. We pass the value at which to evaluate the density, with the loc parameter for the mean and the scale parameter for the standard deviation. By default, loc is 0 and scale is 1 for all the functions available in the norm object.
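As a minimal sketch of the import and the pdf call described above, assuming SciPy is installed. The mean of 3 and standard deviation of 2 in the second call are illustrative values, not taken from the lesson.

```python
from scipy.stats import norm

# Density of the standard normal (loc=0, scale=1 by default) at x = 0
density_standard = norm.pdf(0)

# Density at x = 3 for a normal with mean 3 and standard deviation 2
# (3 and 2 are illustrative values, not from the lesson)
density_custom = norm.pdf(3, loc=3, scale=2)

print(density_standard)  # about 0.3989
print(density_custom)    # about 0.1995
```

Note that pdf returns a density, which is the height of the curve at that point, not a probability.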

3. pdf() vs. cdf()

Consider these two plots. What if we want to calculate the probability of getting a value below -1? The plot on the left is the probability density, with a green area under the curve to the left of -1. This area represents the probability of getting a value less than -1. On the right we have a plot of the cumulative distribution function (cdf), which gives us the size of that green area directly. In this case, it's about 0.16. The cumulative distribution function is an S-shaped function that allows us to calculate the probability of getting a value less than a given x.

4. pdf() vs. cdf() (Cont.)

Let's look at another example. We can see on the left the area we want to calculate, which is the probability of getting a value less than 1.5. On the right we can see that the result of the cdf is 0.93.

5. pdf() vs. cdf() (Cont.)

And finally, for the area under the curve to the left of 5, we can see that the result of the cdf is almost 1.

6. Cumulative distribution function examples

We've seen that if you calculate norm.cdf for -1 you get about 0.16. If you want to know how probable it is to get a value less than 0.5, you can do that with norm.cdf too: in this case the probability is 0.69.
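The two cdf calls from this slide can be sketched as follows, assuming the standard normal (the default loc=0 and scale=1). The variable names are only for illustration.

```python
from scipy.stats import norm

# P(X < -1) for a standard normal
p_below_minus_one = norm.cdf(-1)

# P(X < 0.5) for a standard normal
p_below_half = norm.cdf(0.5)

print(p_below_minus_one)  # about 0.1587
print(p_below_half)       # about 0.6915
```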

7. The percent point function (ppf)

If instead you want to know the value where a given probability is accumulated, you use the percent point function, norm.ppf. Notice the direction of the arrows from probability to values in the plot. For example, if you want the value below which 0.2 of the probability of a normal distribution lies, you use norm.ppf of 0.2 and you get -0.8416. For 0.55 probability, you get 0.1257.
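A short sketch of the ppf calls on this slide, again assuming the standard normal defaults. The variable names are illustrative.

```python
from scipy.stats import norm

# Value x such that P(X < x) = 0.2
x_for_20_percent = norm.ppf(0.2)

# Value x such that P(X < x) = 0.55
x_for_55_percent = norm.ppf(0.55)

print(x_for_20_percent)  # about -0.8416
print(x_for_55_percent)  # about 0.1257
```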

8. ppf() is the inverse of cdf()

As you've seen, we can take values and get probabilities with norm.cdf and we can take probabilities to get values with norm.ppf. One is the inverse of the other.
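One way to check the inverse relationship is a round trip: go from a value to a probability with cdf, then back to a value with ppf. This is a sketch for the standard normal; the starting value 1.5 is reused from the earlier plot.

```python
from scipy.stats import norm

x = 1.5

# Value -> probability
p = norm.cdf(x)

# Probability -> value; this should recover x up to floating-point error
x_recovered = norm.ppf(p)

print(p)            # about 0.9332
print(x_recovered)  # about 1.5
```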

9. Probability between two values

If we want the probability of getting a value between -1 and 1, we take the value of cdf for 1 and subtract the value for -1, and we get 0.68.
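The subtraction described above can be sketched like this, assuming the standard normal defaults.

```python
from scipy.stats import norm

# P(-1 < X < 1) = P(X < 1) - P(X < -1)
p_between = norm.cdf(1) - norm.cdf(-1)

print(p_between)  # about 0.6827
```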

10. Tail probability

If we instead want the probability of a random variable being greater than a given value, we can use norm.sf with the desired value. sf stands for survival function, which is the complement of the cdf, that is, 1 minus the cdf. The probability of getting a value greater than 1 is about 0.16.
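A minimal sketch of the survival function for the standard normal, compared with computing the complement of the cdf by hand.

```python
from scipy.stats import norm

# P(X > 1) using the survival function
p_above_one = norm.sf(1)

# The same probability written as the complement of the cdf
p_above_one_via_cdf = 1 - norm.cdf(1)

print(p_above_one)          # about 0.1587
print(p_above_one_via_cdf)  # about 0.1587
```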

11. Tails

What if we want to calculate the probability of getting a value less than -2 or greater than 2? We just add the probabilities of each tail using cdf and sf.

12. Tails (Cont.)

The result is 0.0455, which means there's only about a 4.6% probability of a value being more than two standard deviations away from the mean. Tail probabilities are important for studying extreme events.
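Adding the two tails can be sketched as follows for the standard normal: cdf gives the lower tail and sf gives the upper tail.

```python
from scipy.stats import norm

# Lower tail: P(X < -2)
lower_tail = norm.cdf(-2)

# Upper tail: P(X > 2)
upper_tail = norm.sf(2)

# Probability of being in either tail
p_tails = lower_tail + upper_tail

print(p_tails)  # about 0.0455
```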

13. Intervals

Finally, if we want to know the central interval that contains a given probability, we can use norm.interval and specify the probability. For 0.95, we get -1.96 and 1.96.
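A sketch of the interval call for the standard normal. The probability is passed positionally here because, as far as we know, the name of that keyword argument differs between SciPy versions.

```python
from scipy.stats import norm

# Central interval holding 95% of the probability
lower, upper = norm.interval(0.95)

print(lower)  # about -1.96
print(upper)  # about 1.96
```

The endpoints match norm.ppf(0.025) and norm.ppf(0.975), since the remaining 5% is split equally between the two tails.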

14. On to some practice!

Now let's calculate some normal probabilities.

This exercise is part of the course Foundations of Probability in Python.
