
Discrete distributions

1. Discrete distributions

Now we will look at probability distributions.

2. Rolling the dice

Let's consider rolling a standard, six-sided die.

3. Rolling the dice

There are six possible outcomes and each has a one-sixth chance of occurring. This is an example of a probability distribution.

4. Choosing salespeople

This is similar to our earlier scenario, except we had names instead of numbers. Just like rolling a die, each outcome, or name, had an equal chance of occurring.

5. Probability distribution

A probability distribution describes the probability of each possible outcome in a scenario. We can also find the expected value of a distribution, which is the mean. We calculate this by multiplying each value by its probability, one-sixth in this case, and adding everything together. So the expected value of rolling a fair die is 3.5.
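
The expected-value calculation just described can be sketched in a few lines of Python. This is only an illustration: the name `fair_die` and the choice of `Fraction` (to keep the arithmetic exact) are assumptions, not something shown on the slides.

```python
from fractions import Fraction

# Fair six-sided die: every face has probability 1/6 (hypothetical name).
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

# Expected value: multiply each value by its probability and add everything up.
expected_value = sum(face * prob for face, prob in fair_die.items())

print(expected_value)         # 7/2
print(float(expected_value))  # 3.5
```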

6. Why are probability distributions important?

Why is it important to understand probability distributions? Well, they help us to quantify risk and inform decision-making. Also, as we will see later in the course, probability distributions are used in hypothesis testing to understand whether results may have occurred by chance.

7. Visualizing a probability distribution

We can visualize a probability distribution using a histogram, where each bar represents an outcome, and each bar's height represents the probability of that outcome.

8. Probability = area

We can calculate probabilities of different outcomes by taking areas of the probability distribution. For example, what's the probability that our die roll is less than or equal to two? To figure this out, we'll take the area of each bar representing an outcome of two or less.

9. Probability = area

Each bar has a width of one and a height of one-sixth, so the area of each bar is one-sixth. Summing the areas for one and two, we get a probability of one-third.
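
As a rough sketch of the area calculation above, we can treat each bar as a rectangle of width one and add up the areas for the outcomes we care about. The names `bar_width` and `prob_at_most_two` are made up for this example.

```python
from fractions import Fraction

# Bar heights for a fair die: one-sixth for each face.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
bar_width = 1  # each bar covers exactly one outcome

# Area of a bar = width * height; sum the bars for outcomes of two or less.
prob_at_most_two = sum(
    bar_width * height for face, height in fair_die.items() if face <= 2
)

print(prob_at_most_two)  # 1/3
```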

10. Uneven die

Now let's say we have a die where the two got turned into a three. This means we now have a zero percent chance of getting a two, and a one-in-three (about 33%) chance of getting a three. To calculate the expected value of this die, we now multiply two by zero, since it's impossible to get a two, and three by its new probability, one-third, while every other value is still multiplied by one-sixth. This gives us an expected value of about 3.67.
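
Here is a minimal sketch of the same calculation for the altered die, assuming the other four faces keep their one-sixth probability. The dictionary name `uneven_die` is hypothetical.

```python
from fractions import Fraction

# Altered die: the two became a three, so P(2) = 0 and P(3) = 1/3.
uneven_die = {
    1: Fraction(1, 6),
    2: Fraction(0),
    3: Fraction(1, 3),
    4: Fraction(1, 6),
    5: Fraction(1, 6),
    6: Fraction(1, 6),
}

# Same recipe as before: value times probability, summed over all outcomes.
uneven_expected_value = sum(face * prob for face, prob in uneven_die.items())

print(uneven_expected_value)                   # 11/3
print(round(float(uneven_expected_value), 2))  # 3.67
```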

11. Visualizing uneven probabilities

When we visualize these new probabilities, the bars are no longer even.

12. Adding areas

With this die, what's the probability of getting something less than or equal to two? There's a one-sixth probability of getting one, and zero probability of getting two,

13. Adding areas

which sums to one-sixth.
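
To compare the two dice side by side, one option is a small helper that adds up the probabilities at or below a threshold. The helper `prob_at_most` and both dictionaries are illustrative names, not part of the course material.

```python
from fractions import Fraction

def prob_at_most(distribution, threshold):
    """Sum the probabilities of all outcomes less than or equal to threshold."""
    return sum(prob for face, prob in distribution.items() if face <= threshold)

fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

# Start from the fair die, then move the two's probability onto the three.
uneven_die = dict(fair_die)
uneven_die[2] = Fraction(0)
uneven_die[3] = Fraction(1, 3)

print(prob_at_most(uneven_die, 2))  # 1/6
print(prob_at_most(fair_die, 2))    # 1/3
```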

14. Discrete probability distributions

The probability distributions we've seen so far are discrete, since they represent situations with discrete outcomes, such as count or interval data. In the case of a die, we're counting dots, so we can't roll a 1.5 or 4.3. When all outcomes have the same probability, like a fair die, this is called a discrete uniform distribution.

15. Sampling from a discrete distribution

Just like we sampled names from a box, we can do the same thing with dice rolls. Here are the potential outcomes of a roll. Its expected value is 3.5. If we roll a die 10 times, we are sampling with replacement, since we can get the same result more than once. Here, four rolls produced a two.
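
One way to simulate this is with the standard library's `random.choices`, which samples with replacement. This is a sketch only: the seed is arbitrary, and the rolls it produces will not match the ones on the slide.

```python
import random

faces = [1, 2, 3, 4, 5, 6]

# A seeded generator makes the example repeatable; the seed value is arbitrary.
rng = random.Random(42)

# choices() samples with replacement, so the same face can appear more than once.
rolls = rng.choices(faces, k=10)
sample_mean = sum(rolls) / len(rolls)

print(rolls)
print(sample_mean)
```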

16. Visualizing a sample

We can visualize the outcomes of the 10 rolls using a histogram.

17. Sample distribution vs theoretical distribution

Because the sample was random, each number came up a different number of times, even though every number has the same probability of being rolled. The mean of our sample is 3.0, which isn't very close to the 3.5 we were expecting.
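
The comparison between a sample and the theoretical distribution can be sketched by counting how often each face appears in a simulated sample. The seed and variable names here are assumptions, and the simulated rolls are not the ones from the slide.

```python
import random
from collections import Counter

faces = [1, 2, 3, 4, 5, 6]
rng = random.Random(0)  # arbitrary seed, for repeatability only
rolls = rng.choices(faces, k=10)

# Share of the sample taken up by each face, next to the theoretical 1/6.
counts = Counter(rolls)
sample_proportions = {face: counts[face] / len(rolls) for face in faces}

for face in faces:
    print(f"{face}: sample {sample_proportions[face]:.2f}, theoretical {1 / 6:.2f}")
```

With only 10 rolls, the sample proportions will usually stray noticeably from one-sixth.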

18. A bigger sample

If we roll the die 100 times, the distribution of the rolls looks a bit more even, and the mean is closer to 3.5.

19. An even bigger sample

If we roll 1000 times, it looks even more like the theoretical probability distribution and the mean closely matches 3.5.

20. Law of large numbers

This is called the law of large numbers! As the size of the sample increases, the sample mean tends to get closer to the theoretical mean.
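
A quick simulation can illustrate the idea. The sketch below rolls a simulated fair die 10, 100, and 1000 times, as in the slides, and adds a 100,000-roll sample that is not part of the original walkthrough. The seed is arbitrary and the exact means will depend on it.

```python
import random

faces = [1, 2, 3, 4, 5, 6]
theoretical_mean = sum(faces) / len(faces)  # 3.5 for a fair die

rng = random.Random(2024)  # arbitrary seed, for repeatability only

# Mean of each simulated sample, keyed by sample size.
sample_means = {}
for n_rolls in (10, 100, 1000, 100_000):
    rolls = rng.choices(faces, k=n_rolls)
    sample_means[n_rolls] = sum(rolls) / n_rolls

for n_rolls, sample_mean in sample_means.items():
    gap = abs(sample_mean - theoretical_mean)
    print(f"{n_rolls:>7} rolls: mean {sample_mean:.3f}, gap from 3.5 = {gap:.3f}")
```

The gap does not have to shrink at every step, but it tends to get smaller as the number of rolls grows.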

21. Let's practice!

Time to solidify our knowledge of probability distributions.