Relationship between Binomial and Poisson distributions
You just heard that the Poisson distribution is a limit of the Binomial distribution for rare events. This makes sense if you think about the stories. Say we do a Bernoulli trial every minute for an hour, each with a success probability of 0.1. We would do 60 trials, and the number of successes is Binomially distributed, and we would expect to get about 6 successes. This is just like the Poisson story we discussed in the video, where we get on average 6 hits on a website per hour. So, the Poisson distribution with arrival rate equal to \(np\) approximates a Binomial distribution for \(n\) Bernoulli trials with probability \(p\) of success (with \(n\) large and \(p\) small). Importantly, the Poisson distribution is often simpler to work with because it has only one parameter instead of two for the Binomial distribution.
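To make the approximation concrete, the sketch below compares the exact Binomial probabilities for the example above (60 trials, success probability 0.1) with the Poisson probabilities for an arrival rate of \(np = 6\). The helper names `binomial_pmf` and `poisson_pmf` are illustrative and not part of the course code; only the Python standard library is used.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, rate):
    """Probability of exactly k arrivals for a given arrival rate."""
    return rate**k * exp(-rate) / factorial(k)

# The example from the text: one trial per minute for an hour, p = 0.1
n, p = 60, 0.1
rate = n * p  # matching Poisson arrival rate

# Print the two probabilities side by side for a few values of k
for k in range(0, 13, 2):
    print(k, round(binomial_pmf(k, n, p), 4), round(poisson_pmf(k, rate), 4))

# Largest pointwise difference between the two distributions
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, rate))
              for k in range(n + 1))
print('largest difference:', max_gap)
```

The two columns should be close but not identical; the gap shrinks further as \(n\) grows and \(p\) shrinks with \(np\) held fixed.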
Let's explore these two distributions computationally. You will compute the mean and standard deviation of samples from a Poisson distribution with an arrival rate of 10. Then, you will compute the mean and standard deviation of samples from a Binomial distribution with parameters \(n\) and \(p\) such that \(np = 10\).
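Before sampling, it helps to know what values to expect. A Poisson distribution with arrival rate \(\lambda\) has mean \(\lambda\) and standard deviation \(\sqrt{\lambda}\), while a Binomial distribution has mean \(np\) and standard deviation \(\sqrt{np(1-p)}\). The sketch below evaluates these standard formulas for the \(n, p\) pairs used in this exercise; the variable names are illustrative.

```python
import math

rate = 10  # Poisson arrival rate

# Theoretical Poisson standard deviation: sqrt(rate)
poisson_std = math.sqrt(rate)
print('Poisson std:', poisson_std)

# Binomial parameters chosen so that n * p is always 10
n_values = [20, 100, 1000]
p_values = [0.5, 0.1, 0.01]

# Theoretical Binomial standard deviation: sqrt(n * p * (1 - p))
binomial_stds = [math.sqrt(n * p * (1 - p))
                 for n, p in zip(n_values, p_values)]

for n, p, std in zip(n_values, p_values, binomial_stds):
    print('n =', n, 'p =', p, 'Binomial std:', std)
```

The means all equal 10, so the difference shows up in the standard deviation: the factor \(1 - p\) pulls the Binomial value below \(\sqrt{10} \approx 3.16\), and it approaches that value as \(p\) gets small.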
This exercise is part of the course
Statistical Thinking in Python (Part 1)
Exercise instructions
- Using the rng.poisson() function, draw 10000 samples from a Poisson distribution with a mean of 10.
- Make a list of the n and p values to consider for the Binomial distribution. Choose n = [20, 100, 1000] and p = [0.5, 0.1, 0.01] so that \(np\) is always 10.
- Using rng.binomial() inside the provided for loop, draw 10000 samples from a Binomial distribution with each n, p pair and print the mean and standard deviation of the samples. There are 3 n, p pairs: (20, 0.5), (100, 0.1), and (1000, 0.01). These can be accessed inside the loop as n[i], p[i].
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Draw 10,000 samples out of Poisson distribution: samples_poisson

# Print the mean and standard deviation
print('Poisson: ', np.mean(samples_poisson),
                   np.std(samples_poisson))

# Specify values of n and p to consider for Binomial: n, p

# Draw 10,000 samples for each n,p pair: samples_binomial
for i in range(3):
    samples_binomial = ____

    # Print results
    print('n =', n[i], 'Binom:', np.mean(samples_binomial),
                                 np.std(samples_binomial))
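One possible way to fill in the template is sketched below. It assumes that rng is a NumPy random Generator created with np.random.default_rng(); the seed value 42 and the binomial_stats list are additions for reproducibility and inspection, not part of the exercise.

```python
import numpy as np

# Assumption: rng is a NumPy Generator; the seed is arbitrary
rng = np.random.default_rng(42)

# Draw 10,000 samples out of Poisson distribution: samples_poisson
samples_poisson = rng.poisson(10, size=10000)

# Print the mean and standard deviation
print('Poisson: ', np.mean(samples_poisson),
                   np.std(samples_poisson))

# Specify values of n and p to consider for Binomial: n, p
n = [20, 100, 1000]
p = [0.5, 0.1, 0.01]

# Keep (mean, std) for each pair so the results can be inspected later
binomial_stats = []

# Draw 10,000 samples for each n,p pair: samples_binomial
for i in range(3):
    samples_binomial = rng.binomial(n[i], p[i], size=10000)
    binomial_stats.append((np.mean(samples_binomial),
                           np.std(samples_binomial)))

    # Print results
    print('n =', n[i], 'Binom:', np.mean(samples_binomial),
                                 np.std(samples_binomial))
```

The sample means should all sit near 10, while the Binomial standard deviation should move toward the Poisson value as n grows and p shrinks.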