Exercise

Fitting a normal distribution

When working with relatively small data sets, you often don't have enough data to make principled inferences directly from the sample. However, if you suspect the data follows a normal distribution, it may be reasonable to fit a normal distribution and work with that instead of the raw data. In this exercise you will work with the same data on Hispanic firefighters that you previously tested and could not reject as normally distributed at the 5% level. You will fit a normal distribution to it and use the fit to estimate the percentage of these employees you would expect to have less than 10 years of experience.

The data has been loaded for you as the DataFrame salary_df. pandas (as pd), NumPy (as np), Matplotlib (as plt), and the stats package from SciPy have all been loaded for you.

Instructions

  • Fit a normal distribution to the Years of Employment column and save the resulting mean and standard deviation.
  • Use this mean and standard deviation in a normal CDF to estimate the percentage of employees with less than ten years of experience.
  • Print out this percentage.
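
The three steps above can be sketched as follows. This is a minimal illustration, not the exercise's reference solution: the salary_df built here is a hypothetical stand-in filled with synthetic values (the real DataFrame is preloaded in the exercise), and the names years, mu, sigma, and pct_under_10 are chosen only for this example.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical stand-in for the preloaded salary_df (synthetic values,
# not the real firefighter data).
rng = np.random.default_rng(0)
salary_df = pd.DataFrame(
    {"Years of Employment": rng.normal(loc=12, scale=5, size=40)}
)

years = salary_df["Years of Employment"]

# Fit a normal distribution by maximum likelihood; norm.fit returns the
# location (mean) and scale (standard deviation) of the fitted curve.
mu, sigma = stats.norm.fit(years)

# Evaluate the fitted normal CDF at 10 years and convert to a percentage.
pct_under_10 = 100 * stats.norm.cdf(10, loc=mu, scale=sigma)

print(f"{pct_under_10:.1f}% of employees expected to have under 10 years")
```

Note that stats.norm.fit uses the maximum-likelihood estimate of the standard deviation (dividing by n rather than n - 1), so on a small sample it will be slightly smaller than the default result of pandas' .std().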