Fitting a normal distribution

When working with relatively small data sets, you often don't have enough data to make principled inferences from the raw values alone. However, if you suspect the data follow a normal distribution, it may be reasonable to fit a normal distribution and work with that instead of the raw data. In this exercise you will work with the same data on Hispanic firefighters for which your earlier test did not reject normality at the 5% level. You will fit a normal distribution to it and use the fit to estimate the percentage of these employees we would generally expect to have less than 10 years of experience.

The data has been loaded for you as the DataFrame salary_df. The packages pandas (as pd), NumPy (as np), Matplotlib's pyplot (as plt), and the stats module from SciPy have all been imported for you.

This exercise is part of the course

Foundations of Inference in Python

Exercise instructions

  • Fit a normal distribution to the Years of Employment column and save the resulting mean and standard deviation.
  • Use this mean and standard deviation in a normal CDF to estimate the percentage of employees with less than ten years of experience.
  • Print out this percentage.
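
The steps above can be sketched roughly as follows. This is a minimal illustration rather than the exercise's official solution: salary_df is not available here, so the code builds a hypothetical stand-in (demo_df) from synthetic values, and the mean of 15 and spread of 6 years are made-up assumptions. Only the column name Years of Employment comes from the exercise. The sketch uses stats.norm.fit, which returns the fitted location (mean) and scale (standard deviation), and stats.norm.cdf, which returns a proportion between 0 and 1; it multiplies by 100 to express the result as a percentage.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical stand-in for salary_df: synthetic values, NOT the course data
rng = np.random.default_rng(0)
demo_df = pd.DataFrame(
    {"Years of Employment": rng.normal(loc=15, scale=6, size=40)}
)

# Fit a normal distribution: norm.fit returns the estimated (loc, scale)
years = demo_df["Years of Employment"].to_numpy()
mu, std = stats.norm.fit(years)

# P(X < 10) under the fitted normal, converted from a proportion to a percentage
percent = 100 * stats.norm.cdf(10, loc=mu, scale=std)

print(f"{percent:.1f}% expected to have under 10 years of experience")
```

Note that stats.norm.fit uses maximum likelihood, so the fitted standard deviation divides by n rather than n - 1 and will be slightly smaller than the default result of the pandas std method on the same column.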

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Fit a normal distribution to the data
mu, std = ____

# Compute the percentage of employees with less than 10 years experience
percent = stats.____(____, loc=____, scale=____)

# Print out this percentage
____