Normal distribution
On to the most recognizable and useful distribution of the bunch: the normal, or Gaussian, distribution. In the slides, we briefly touched on its bell-curve shape and on how the normal distribution, together with the central limit theorem, enables us to perform hypothesis tests.
Similar to the previous exercises, here you'll start by simulating some data and examining the distribution, then dive a little deeper and examine the probability of certain observations taking place.
This exercise is part of the course
Practicing Statistics Interview Questions in Python
Exercise instructions
- Generate the data for the distribution by using the rvs() function with size set to 1000; assign it to the data variable.
- Display a matplotlib histogram; examine the shape of the distribution.
- Given a standard normal distribution, what is the probability of an observation greater than 2?
- Looking at our sample, what is the probability of an observation greater than 2?
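For the third question, the probability of an observation greater than 2 is the upper tail of the standard normal distribution, which is 1 minus the cumulative probability at 2. The snippet below is a minimal sketch of that calculation; the variable names are only illustrative, and norm.sf() is shown as an equivalent shortcut rather than something the exercise asks for.

```python
from scipy.stats import norm

# P(Z > 2) for a standard normal Z, written as the complement of the CDF
upper_tail = 1 - norm.cdf(2)

# The survival function returns the same upper-tail probability directly
upper_tail_sf = norm.sf(2)

print(round(upper_tail, 4))     # 0.0228
print(round(upper_tail_sf, 4))  # 0.0228
```

So roughly 2.3% of observations from a standard normal distribution fall above 2.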
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
import matplotlib.pyplot as plt
from scipy.stats import norm

# Generate normal data
data = norm.rvs(size=____)
# Plot distribution
plt.hist(____)
plt.show()
# Compute and print true probability for greater than 2
true_prob = 1 - norm.cdf(____)
print(____)
# Compute and print sample probability for greater than 2
sample_prob = sum(obs > ____ for obs in data) / len(____)
print(____)
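Once the blanks are filled in, the sample probability will usually land close to the true probability but not exactly on it. The sketch below is one way to see how close to expect, covering only the two probability steps. It assumes a fixed random_state (not part of the exercise) so the run is reproducible, and it adds a standard-error estimate, std_err, which is an illustrative extra rather than something the exercise requires.

```python
import numpy as np
from scipy.stats import norm

n = 1000

# Fixed seed is an assumption added here for reproducibility
data = norm.rvs(size=n, random_state=42)

# True probability of a standard normal observation above 2
true_prob = 1 - norm.cdf(2)

# Share of the simulated observations above 2
sample_prob = np.mean(data > 2)

# Standard error of a sample proportion: sqrt(p * (1 - p) / n)
std_err = np.sqrt(true_prob * (1 - true_prob) / n)

print(true_prob)
print(sample_prob)
print(std_err)
```

With 1000 observations the standard error is about 0.005, so a sample probability anywhere from roughly 0.01 to 0.03 is consistent with the true value of about 0.023.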