Normal probabilities

1. Normal probabilities

You're familiar with the fundamentals of normal distributions; now we're going to calculate probabilities. Let's do it!

2. Probability density

Before we start, we have to import the norm object from the scipy dot stats library. This has to be done once in every script or session where we use norm. In the rest of the lesson we will assume it is already imported. To calculate the probability density at a given value we use the probability density function, pdf. We pass the value we want to evaluate, along with the loc parameter for the mean and the scale parameter for the standard deviation. By default, loc is 0 and scale is 1 for all the methods available in the norm object, which gives the standard normal distribution.
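
As a minimal sketch of the import and the pdf call described here, the snippet below evaluates the density once with the defaults and once with loc and scale set explicitly. The values 0, 3, and 2 are illustrative assumptions and do not come from the slides.

```python
from scipy.stats import norm

# Density of the standard normal (loc=0, scale=1 by default) at x = 0.
density_standard = norm.pdf(0)

# Density at x = 3 for a normal with mean 3 and standard deviation 2.
# These parameter values are made up for illustration.
density_shifted = norm.pdf(3, loc=3, scale=2)

print(density_standard)  # about 0.3989
print(density_shifted)   # about 0.1995
```

Note that a density is not a probability: it is the height of the curve at that point, which is why the later slides work with areas instead.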

3. pdf() vs. cdf()

Consider these two plots. What if we want to calculate the probability of getting a value below -1? The plot on the left is the probability density, with a green area under the curve to the left of -1. This area represents the probability of getting a value less than -1. On the right we have a plot of the cumulative distribution function (cdf), whose height at -1 equals the size of the green area. In this case, it's about 0.16. The cumulative distribution function is an S-shaped function that allows us to calculate the probability of getting a value less than a given x.

4. pdf() vs. cdf() (Cont.)

Let's look at another example. We can see on the left the area we want to calculate, which is the probability of getting a value less than 1.5. On the right we can see that the result of the cdf is 0.93.

5. pdf() vs. cdf() (Cont.)

And finally, for the area under the curve to the left of 5, we can see that the result of the cdf is almost 1.

6. Cumulative distribution function examples

We've seen that if you calculate norm dot cdf for -1 you get about 0.16. If you want to know how probable it is to get a value less than 0.5, you can do that with norm dot cdf too: in this case the probability is 0.69.
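
The two cdf calls mentioned here can be sketched as follows, assuming the standard normal (the default loc and scale). The variable names are only for illustration.

```python
from scipy.stats import norm

# Probability of getting a value less than -1 in a standard normal.
prob_below_minus_one = norm.cdf(-1)

# Probability of getting a value less than 0.5.
prob_below_half = norm.cdf(0.5)

print(prob_below_minus_one)  # about 0.1587
print(prob_below_half)       # about 0.6915
```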

7. The percent point function (ppf)

If instead you want to know the value below which a given probability is accumulated, you use the percent point function, norm dot ppf. Notice the direction of the arrows from probability to values in the plot. For example, if you want the value in a standard normal distribution that has a 0.2 probability of getting a smaller value, you use norm dot ppf of 0.2 and you get -0.8416. For 0.55 probability, you get 0.1257.
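
A short sketch of these two ppf calls, again assuming the default standard normal. Each result is the value x for which the probability of getting something smaller than x equals the input.

```python
from scipy.stats import norm

# Value below which 20% of the probability is accumulated.
value_at_20 = norm.ppf(0.2)

# Value below which 55% of the probability is accumulated.
value_at_55 = norm.ppf(0.55)

print(value_at_20)  # about -0.8416
print(value_at_55)  # about 0.1257
```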

8. ppf() is the inverse of cdf()

As you've seen, we can take values and get probabilities with norm dot cdf and we can take probabilities to get values with norm dot ppf. One is the inverse of the other.
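
One way to convince ourselves of the inverse relationship is a round trip: start from a probability, get the value with ppf, and feed it back into cdf. This is only an illustrative check; the starting points 0.2 and -1 are reused from the earlier examples.

```python
from scipy.stats import norm

# Probability -> value -> probability.
value = norm.ppf(0.2)
prob_again = norm.cdf(value)

# Value -> probability -> value.
prob = norm.cdf(-1)
value_again = norm.ppf(prob)

print(prob_again)   # back to 0.2, up to floating-point error
print(value_again)  # back to -1, up to floating-point error
```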

9. Probability between two values

If we want the probability of getting a value between -1 and 1, we take the value of cdf for 1 and subtract the value for -1, and we get 0.68.
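
The subtraction described here can be sketched like this for the standard normal. The variable name is an assumption for illustration.

```python
from scipy.stats import norm

# P(-1 < X < 1) = P(X < 1) - P(X < -1)
prob_between = norm.cdf(1) - norm.cdf(-1)

print(prob_between)  # about 0.6827
```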

10. Tail probability

If we instead want the probability of a random variable being greater than a given value, we can use norm dot sf with the desired value. sf stands for survival function, which is the complement of the cdf, that is, 1 minus the cdf. The probability of getting a value greater than 1 is about 0.16.
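
Here is a minimal sketch of the survival function for the standard normal, together with the complement computed by hand so we can compare the two.

```python
from scipy.stats import norm

# Probability of getting a value greater than 1.
prob_above_one = norm.sf(1)

# The same quantity computed as the complement of the cdf.
prob_complement = 1 - norm.cdf(1)

print(prob_above_one)   # about 0.1587
print(prob_complement)  # same value, up to floating-point error
```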

11. Tails

What if we want to calculate the probability of getting a value less than -2 or greater than 2? We just add the probabilities of each tail, using cdf for the lower tail and sf for the upper tail.

12. Tails (Cont.)

The result is about 0.0455, which means there's only about a 4.6% probability of a value being more than two standard deviations away from the mean. Tail probabilities are important for studying extreme events.
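
The two-tail calculation can be sketched as follows, assuming the standard normal so that -2 and 2 sit two standard deviations from the mean.

```python
from scipy.stats import norm

# Lower tail: P(X < -2), from the cdf.
lower_tail = norm.cdf(-2)

# Upper tail: P(X > 2), from the survival function.
upper_tail = norm.sf(2)

# The two events cannot happen together, so the probabilities add.
prob_tails = lower_tail + upper_tail

print(prob_tails)  # about 0.0455
```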

13. Intervals

Finally, if we want to know the interval around the mean that contains a given probability, we can use norm dot interval and specify the probability. For 0.95, we get -1.96 and 1.96.
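
A short sketch of norm dot interval for the standard normal. The probability is passed positionally here because, as far as we know, the keyword name for this argument has differed between SciPy versions.

```python
from scipy.stats import norm

# Interval around the mean that contains 95% of the probability.
lower, upper = norm.interval(0.95)

print(lower, upper)  # about -1.96 and 1.96
```

The interval leaves equal probability in each tail, so the same endpoints could also be obtained from norm dot ppf of 0.025 and 0.975.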

14. On to some practice!

Now let's calculate some normal probabilities.