Probability mass and distribution functions
1. Probability mass and distribution functions
After conducting many random experiments, you will notice that some outcomes are more likely than others. The way probability is spread across the possible outcomes is called a probability distribution. There are two important functions that are useful for probability calculations: the probability mass function and the cumulative distribution function.
2. Probability mass function (pmf)

A discrete random variable has a countable number of possible outcomes. The probability mass function allows you to calculate the probability of getting a particular outcome for a discrete random variable. The binomial probability mass function allows you to calculate the probability of getting k heads from n coin flips, where each flip has probability p of landing heads.
3. Probability mass function (pmf)

The formula multiplies the number of different ways that you can get k successes out of n coin flips...
4. Probability mass function (pmf) (Cont.)

by the probability of success raised to the number of successes, k...
5. Probability mass function (pmf) (Cont.)

by the probability of failure, 1 - p, raised to the number of failures, n - k. It's okay if you don't understand the formula right now. With practice, your intuition about this will grow.
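Putting those three pieces together, the formula being described is P(X = k) = (n choose k) * p^k * (1 - p)^(n - k). Here is a minimal Python sketch of that formula, checked against scipy's binom.pmf; the helper name binom_pmf_by_hand is just for illustration and is not part of the lesson.

```python
from math import comb          # number of ways to choose k items out of n
from scipy.stats import binom

def binom_pmf_by_hand(k, n, p):
    """Probability of exactly k heads in n flips, each with heads probability p."""
    # ways to pick which k flips are heads  *  p^k  *  (1 - p)^(n - k)
    return comb(n, k) * p**k * (1 - p)**(n - k)

print(binom_pmf_by_hand(5, 10, 0.5))  # ~0.246
print(binom.pmf(5, 10, 0.5))          # scipy returns the same value
```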
6. Probability mass function (pmf)

If we plot the probability mass function of getting k heads out of 10 fair coin flips, you can see that 5 is the most likely outcome.
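The lesson does not show the plotting code, but a rough sketch of such a plot, assuming matplotlib and numpy are available, could look like this:

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import binom

n, p = 10, 0.5
k = np.arange(n + 1)                    # possible numbers of heads: 0 through 10

plt.bar(k, binom.pmf(k, n, p))          # probability of each outcome
plt.xlabel("Number of heads (k)")
plt.ylabel("Probability")
plt.title("Binomial pmf: 10 fair coin flips")
plt.show()                              # the bar at k = 5 is the tallest
```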
7. Probability mass function (pmf) (Cont.)

With the scipy dot stats library, we can use the binom dot pmf function to calculate this probability.
8. Calculating probabilities with `binom.pmf()`

If you use binom dot pmf with parameters k equals 2, n equals 10, and p equals 0.5, you get the probability of getting 2 heads from 10 flips of a fair coin -- that is, 4%. The probability of getting 5 heads from 10 coin flips is almost 25%.
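The two calls just described look roughly like this in code (output values are approximate):

```python
from scipy.stats import binom

# probability of exactly 2 heads in 10 flips of a fair coin
print(binom.pmf(2, 10, 0.5))   # ~0.044, about 4%

# probability of exactly 5 heads in 10 flips of a fair coin
print(binom.pmf(5, 10, 0.5))   # ~0.246, almost 25%
```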
9. Calculating probabilities with binom.pmf() (Cont.)

The probability of getting 50 heads out of 100 flips of a biased coin with 30% probability of getting heads is extremely small: not even a 1% chance. If instead you calculate the probability of getting 65 heads from 100 flips of a biased coin with 70% probability of getting heads, you see that it's almost 5%. As n gets larger, the probability mass is spread over more possible outcomes, so the probability of any single value of k shrinks for the same p.
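And the biased-coin examples from this slide, again as a short sketch with approximate outputs:

```python
from scipy.stats import binom

# 50 heads out of 100 flips when heads only has a 30% chance
print(binom.pmf(50, 100, 0.3))   # ~0.00001, far below 1%

# 65 heads out of 100 flips when heads has a 70% chance
print(binom.pmf(65, 100, 0.7))   # ~0.047, almost 5%
```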
10. Probability distribution function (cdf)

If you instead want to calculate the probability of getting k or fewer heads from n throws, you use the binomial cumulative distribution function, which adds the probabilities of...
11. Probability distribution function (cdf) (Cont.)

getting 0 heads out of n flips...
12. Probability distribution function (cdf) (Cont.)

getting heads once out of n flips...
13. Probability distribution function (cdf) (Cont.)

and getting all the way up to k heads out of n flips.
14. Cumulative distribution function (cdf)

The binomial cumulative distribution function allows us to calculate the cumulative probability of getting k heads or fewer from n coin flips with probability p of getting heads. In Python we use the binom dot cdf function with parameters k, n, and p. Adding up the probabilities from the mass function gives the cumulative distribution function (cdf). This is a way of getting the probability of a range of outcomes rather than the probability of a single outcome.
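To see the "adding up" idea in code, here is a minimal check, using the same fair-coin parameters as above, that summing the pmf from 0 up to k reproduces binom dot cdf:

```python
from scipy.stats import binom

k, n, p = 5, 10, 0.5

# add the probabilities of 0 heads, 1 head, ..., up to k heads
summed = sum(binom.pmf(i, n, p) for i in range(k + 1))

print(summed)              # ~0.623
print(binom.cdf(k, n, p))  # the cdf gives the same cumulative probability
```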
15. Cumulative distribution function (cdf) (Cont.)

With the scipy dot stats library, we can use the binom dot cdf function to get such a probability using the same parameters.
16. Calculating cumulative probabilities

If you use binom dot cdf with parameters k equals 5, n equals 10, and p equals 0.5, you get the probability of getting heads 5 times or fewer out of 10 flips, which is 62%. The probability of getting heads 50 times or fewer out of 100 flips of a biased coin with 30% probability of getting heads is near 100%. It's almost guaranteed.
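Those two cumulative probabilities, sketched in code with approximate outputs:

```python
from scipy.stats import binom

# 5 heads or fewer out of 10 flips of a fair coin
print(binom.cdf(5, 10, 0.5))     # ~0.62

# 50 heads or fewer out of 100 flips when heads only has a 30% chance
print(binom.cdf(50, 100, 0.3))   # ~1.0, almost guaranteed
```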
17. Calculating cumulative probabilities (Cont.)

What if we want the probability of getting heads more than k times? This is called the complement, and we get it by subtracting the cdf from 1. Alternatively, we can calculate the complement using the function binom dot sf with the same parameters; sf stands for survival function, which allows you to get tail probabilities, or the complement in this case. For example, the probability of getting heads more than 59 times from 100 flips of a biased coin with p equal to 70% is 99% -- again, it's almost certain.
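Both routes to that 99% figure give the same tail probability; a quick sketch:

```python
from scipy.stats import binom

# probability of more than 59 heads in 100 flips with a 70% heads probability
print(1 - binom.cdf(59, 100, 0.7))   # ~0.99, via the complement of the cdf
print(binom.sf(59, 100, 0.7))        # survival function: same result
```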
18. Let's calculate some probabilities

We've had some fun calculating probabilities. Now let's practice some more.