Fitting a normal distribution
When working with relatively small data sets, you often don't have enough data to make principled inferences. However, if you suspect the data follow a normal distribution, it may be reasonable to fit a normal distribution and work with that rather than with the raw data. In this exercise you will work with the same data on Hispanic firefighters, for which you previously found no evidence against normality at the 5% significance level. You will fit a normal distribution to it and use the fit to find the percentage of these employees we would generally expect to have less than 10 years of experience.
This DataFrame has been loaded for you in salary_df. The packages pandas as pd, NumPy as np, Matplotlib as plt, and the stats package from SciPy have all been loaded for you.
This exercise is part of the course Foundations of Inference in Python.
Instructions
- Fit a normal distribution to the Years of Employment column and save the resulting mean and standard deviation.
- Use this mean and standard deviation in a normal CDF to estimate the percentage of employees with less than ten years of experience.
- Print out this percentage.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Fit a normal distribution to the data
mu, std = ____
# Compute the percentage of employees with less than 10 years of experience
percent = stats.____(____, loc=____, scale=____)
# Print out this percentage
____
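
The steps in the instructions can be sketched as follows. This is one possible approach, not necessarily the course's reference solution. Because the real salary_df is not available here, the sketch builds a hypothetical stand-in from synthetic data (the sample size, mean, and spread are assumptions chosen only for illustration), so the printed value will not match the exercise's answer. It uses stats.norm.fit for the fit and stats.norm.cdf for the probability.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical stand-in for salary_df: synthetic data, NOT the course data set
rng = np.random.default_rng(seed=0)
salary_df = pd.DataFrame(
    {"Years of Employment": rng.normal(loc=15, scale=6, size=40)}
)

# Maximum-likelihood fit; norm.fit returns (mean, standard deviation)
mu, std = stats.norm.fit(salary_df["Years of Employment"])

# P(X < 10) under the fitted normal, returned as a proportion between 0 and 1
percent = stats.norm.cdf(10, loc=mu, scale=std)

# Format the proportion as a percentage for display
print(f"{percent:.1%}")
```

Note that stats.norm.cdf returns a proportion rather than a percentage, so multiply by 100 (or use a percent format as above) when reporting it. The standard deviation returned by stats.norm.fit is the maximum-likelihood estimate, which divides by n rather than n - 1, so it is slightly smaller than the default pandas sample standard deviation.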