Do the data follow our story?
You have modeled the time between no-hitters using an Exponential distribution. Create an ECDF of the real data, then overlay the theoretical CDF on it. If the two curves agree closely, the Exponential distribution is a reasonable description of the observed data.
It may be helpful to remind yourself of the function you created in the previous course to compute the ECDF, as well as the code you wrote to plot it.
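As a reminder, an ECDF function of this kind can be sketched as follows. This is a minimal version that assumes NumPy and one-dimensional input; the `ecdf()` you wrote in the earlier course may differ in its details.

```python
import numpy as np


def ecdf(data):
    """Return x, y arrays for the empirical CDF of a 1-D data set."""
    n = len(data)
    x = np.sort(data)            # observations in ascending order
    y = np.arange(1, n + 1) / n  # fraction of observations <= each x
    return x, y


# Small hand-checkable example: four points give steps of 1/4
x, y = ecdf([3, 1, 2, 4])
print(x)  # sorted values 1, 2, 3, 4
print(y)  # cumulative fractions 0.25, 0.5, 0.75, 1.0
```

Each `y` value is the fraction of the data that is less than or equal to the matching `x` value, so the last `y` is always 1.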
This exercise is part of the course
Statistical Thinking in Python (Part 2)
Exercise instructions
- Compute an ECDF from the actual time between no-hitters (`nohitter_times`). Use the `ecdf()` function you wrote in the prequel course.
- Create a CDF from the theoretical samples you took in the last exercise (`inter_nohitter_time`).
- Plot `x_theor` and `y_theor` as a line using `plt.plot()`. Then overlay the ECDF of the real data `x` and `y` as points. To do this, you have to specify the keyword arguments `marker = '.'` and `linestyle = 'none'` in addition to `x` and `y` inside `plt.plot()`.
- Set a 2% margin on the plot.
- Show the plot.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create an ECDF from real data: x, y
x, y = ____
# Create a CDF from theoretical samples: x_theor, y_theor
x_theor, y_theor = ____
# Overlay the plots
plt.plot(____, ____)
plt.plot(____, ____, marker=____, linestyle=____)
# Margins and axis labels
plt.margins(____)
plt.xlabel('Games between no-hitters')
plt.ylabel('CDF')
# Show the plot
plt.show()
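One way the completed template might look is sketched below. The course data are not available here, so `nohitter_times` is replaced by synthetic Exponential draws with an arbitrary scale of 800, and `inter_nohitter_time` is assumed to be 100,000 Exponential samples whose scale is the mean of the data. Both are stand-ins for illustration, not the values from the exercise.

```python
import numpy as np
import matplotlib.pyplot as plt


def ecdf(data):
    """Return x, y arrays for the empirical CDF of a 1-D data set."""
    x = np.sort(data)
    y = np.arange(1, len(data) + 1) / len(data)
    return x, y


# ASSUMPTION: synthetic stand-in for the real no-hitter data
rng = np.random.default_rng(42)
nohitter_times = rng.exponential(scale=800, size=250)

# ASSUMPTION: theoretical samples use the mean of the data as the scale
tau = np.mean(nohitter_times)
inter_nohitter_time = rng.exponential(scale=tau, size=100_000)

# Create an ECDF from real data: x, y
x, y = ecdf(nohitter_times)

# Create a CDF from theoretical samples: x_theor, y_theor
x_theor, y_theor = ecdf(inter_nohitter_time)

# Overlay the plots: theory as a line, data as points
plt.plot(x_theor, y_theor)
plt.plot(x, y, marker='.', linestyle='none')

# Margins and axis labels
plt.margins(0.02)
plt.xlabel('Games between no-hitters')
plt.ylabel('CDF')

# Show the plot
plt.show()
```

With the real data in place of the synthetic array, points that hug the line suggest the Exponential model fits; systematic gaps between the points and the line suggest it does not.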