1. Poisson distributions
The Poisson distribution is a probability distribution that models the number of times an event occurs during a fixed interval of time.
2. Poisson modeling
Suppose the mean number of call center calls per minute is 2.2.
What is the probability of having 3 calls in any minute?
Calling a call center is an example of a Poisson process. Other examples include visiting a bank branch, finishing a course on DataCamp, and so on.
3. Poisson distribution properties
Before doing any calculations, let's study the most important properties.
In a Poisson distribution, each event counts as a success, the outcome is the number of successes in a fixed interval, and the average number of successes per unit of time is known.
In the call center example, a call is a success and the average is the mean number of calls per minute.
In this lesson, we will not go into detail about the mathematical formulas for the Poisson distribution. If you're interested, you can find more information online.
4. Probability mass function (pmf)
Let's do a few probability calculations with the Poisson distribution.
First, we'll calculate the pmf.
5. Probability mass function (pmf) (Cont.)
Suppose we know that the average number of calls per minute to the call center is 2.2, and we want to know the probability of having 3 phone calls in a minute.
To find this we use the probability mass function and specify mu, the mean of the distribution.
First we import the poisson object from scipy dot stats, then we call poisson dot pmf with k equals 3 and mu equals 2.2.
We will use the same mu throughout the lesson.
The result is approximately 0.197 (0.1966 to four decimal places).
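Here is a minimal sketch of the call just described, assuming SciPy is installed. The variable name p_three is only illustrative.

```python
from scipy.stats import poisson

# Probability of exactly 3 calls in a minute when the mean is 2.2 calls per minute
p_three = poisson.pmf(k=3, mu=2.2)
print(round(p_three, 4))  # 0.1966
```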
6. pmf examples
If we want the probability of having no calls in a minute, we call poisson dot pmf with k equals 0, and we get 0.11.
If we instead want the probability of having 6 calls in a minute, we get 0.017.
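The two calls above can be sketched the same way. The names mu, p_zero, and p_six are placeholders chosen for this example.

```python
from scipy.stats import poisson

mu = 2.2  # mean number of calls per minute, as in the lesson

p_zero = poisson.pmf(k=0, mu=mu)  # probability of no calls in a minute
p_six = poisson.pmf(k=6, mu=mu)   # probability of exactly 6 calls in a minute

print(round(p_zero, 2))  # 0.11
print(round(p_six, 3))   # 0.017
```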
7. Different means
Take a look at these plots. You'll notice that for different means, the shape of the distribution varies.
When the mean is small, the probability of having 0 events is high. As the mean increases, the peak of the distribution moves to the right.
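To see this numerically rather than in a plot, here is a rough sketch that compares a few means. The means 0.5, 5.5, and 9.5 are assumptions picked for illustration, not necessarily the ones shown in the plots. Only 2.2 comes from the lesson.

```python
import numpy as np
from scipy.stats import poisson

k = np.arange(0, 31)  # counts 0..30, wide enough for the means used here
summary = {}
for mu in (0.5, 2.2, 5.5, 9.5):  # illustrative means
    pmf = poisson.pmf(k, mu)
    # Store P(0 events) and the most likely count for this mean
    summary[mu] = (pmf[0], int(k[np.argmax(pmf)]))
    print(f"mu={mu}: P(0 events)={pmf[0]:.3f}, most likely count={summary[mu][1]}")
```

As the mean grows, the probability of 0 events shrinks and the most likely count moves to larger values.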
Let's study the cdf now.
8. Cumulative distribution function (cdf)
If we want to know the probability of having 2 or fewer phone calls in a minute, we use the cumulative distribution function, cdf.
In the plot on the left, we call poisson dot cdf and specify k equals 2 to get 0.62.
On the right, to find the probability of having 5 or fewer calls in a minute, we specify k equals 5 to get about 0.975.
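Both cdf calls can be sketched as follows. The last line is an extra check, not part of the lesson, showing that the cdf at 2 is the sum of the pmf at 0, 1, and 2.

```python
from scipy.stats import poisson

mu = 2.2  # mean number of calls per minute, as in the lesson

p_at_most_2 = poisson.cdf(k=2, mu=mu)  # P(2 or fewer calls)
p_at_most_5 = poisson.cdf(k=5, mu=mu)  # P(5 or fewer calls)

print(round(p_at_most_2, 4))  # 0.6227
print(round(p_at_most_5, 4))  # 0.9751

# The cdf accumulates the pmf over all counts up to and including k
pmf_sum = sum(poisson.pmf(k=i, mu=mu) for i in range(3))
```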
9. Survival function and percent point function (ppf)
To calculate the probability of having more than 2 calls in a minute, we use the survival function, sf. With k equals 2, we get 0.38.
If we instead want the smallest value at which the cumulative probability reaches a given level, we use the percent point function, ppf.
For 0.5 probability we get a value of 2. In the plot, notice the arrow that goes from the probability to the associated value.
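A short sketch of both functions, again with mu equals 2.2. The variable names are illustrative.

```python
from scipy.stats import poisson

mu = 2.2  # mean number of calls per minute, as in the lesson

# Survival function: P(more than 2 calls), which equals 1 - cdf(2)
p_more_than_2 = poisson.sf(k=2, mu=mu)
print(round(p_more_than_2, 2))  # 0.38

# Percent point function: smallest count whose cumulative probability is at least 0.5
median_calls = poisson.ppf(q=0.5, mu=mu)
print(median_calls)  # 2.0
```

Here cdf at 1 is about 0.35 and cdf at 2 is about 0.62, so 2 is the first count at which the accumulated probability reaches 0.5.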
10. Sample generation (rvs)
Finally, suppose we want to generate 10,000 samples of a Poisson random variable with mean 2.2. We use the rvs function for this.
We first import poisson from scipy dot stats, matplotlib dot pyplot as plt, and seaborn as sns.
Then we call rvs and specify mu, the size of the sample, and random_state equals 13.
We generate the plot by calling sns dot distplot with sample as a parameter and kde equals False.
The result is...
11. Sample generation (Cont.)
This plot, with the frequency of each possible outcome shown as a bar. Notice that the frequencies sum to 10,000.
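The sampling step can be sketched without plotting, assuming NumPy and SciPy are available. Instead of drawing bars, this version tabulates the frequency of each outcome with np.bincount, which is a substitution made for this example. Note that distplot is deprecated in recent seaborn releases. If you want the plot, sns.histplot(sample, discrete=True) should be the closest replacement.

```python
import numpy as np
from scipy.stats import poisson

# Draw 10,000 samples from a Poisson distribution with mean 2.2
sample = poisson.rvs(mu=2.2, size=10000, random_state=13)

# Frequency of each outcome: counts[i] is how many samples equal i
counts = np.bincount(sample)
for value, freq in enumerate(counts):
    print(value, freq)

# The frequencies add up to the sample size
print(counts.sum())  # 10000
```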
12. Let's practice with Poisson
You're doing great -- now let's practice with Poisson.