Visualizing goodness of fit
The chi-square goodness of fit test compares proportions of each level of a categorical variable to hypothesized values. Before running such a test, it can be helpful to visually compare the distribution in the sample to the hypothesized distribution.
Recall the vendor incoterms in the late_shipments dataset. You hypothesize that the four values occur with these frequencies in the population of shipments:
CIP: 0.05
DDP: 0.1
EXW: 0.75
FCA: 0.1
These frequencies are stored in the hypothesized DataFrame.
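For readers working outside the course environment, the following is a minimal sketch of how such a DataFrame could be built. The column names vendor_inco_term and prop are assumptions made for illustration; the exercise text does not state the actual column names.

```python
import pandas as pd

# Hypothesized proportions copied from the list above.
# The column names "vendor_inco_term" and "prop" are assumed, not given.
hypothesized = pd.DataFrame({
    "vendor_inco_term": ["CIP", "DDP", "EXW", "FCA"],
    "prop": [0.05, 0.1, 0.75, 0.1],
})

# Sanity check: the proportions of a complete distribution sum to 1.
total_prop = hypothesized["prop"].sum()
print(hypothesized)
print(total_prop)
```

Checking that the proportions sum to 1 is a cheap guard against typos before they feed into a test statistic.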
The incoterm_counts DataFrame stores the .value_counts() of the vendor_inco_term column.
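As a rough illustration of how a counts table like this might be produced, the sketch below runs .value_counts() on a small made-up stand-in for late_shipments. The toy rows and the count column name n are assumptions; the real dataset and the exact layout of incoterm_counts are not shown in this exercise.

```python
import pandas as pd

# Toy stand-in for late_shipments (made-up rows, not the course data).
late_shipments = pd.DataFrame({
    "vendor_inco_term": ["EXW", "EXW", "FCA", "CIP", "EXW", "DDP", "EXW", "FCA"],
})

# Count each incoterm, then turn the result into a tidy DataFrame
# sorted by incoterm. The count column name "n" is an assumption.
incoterm_counts = (
    late_shipments["vendor_inco_term"]
    .value_counts()
    .rename_axis("vendor_inco_term")
    .reset_index(name="n")
    .sort_values("vendor_inco_term")
    .reset_index(drop=True)
)
print(incoterm_counts)
```

Sorting by incoterm puts the rows in the same order as the hypothesized values, which makes the two tables easier to compare side by side.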
late_shipments is available; pandas and matplotlib.pyplot are loaded with their standard aliases.
This exercise is part of the course Hypothesis Testing in Python.
Have a go at this exercise by completing this sample code.
# Find the number of rows in late_shipments
n_total = ____
# Print n_total
print(n_total)
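One possible way to carry this through to a visual comparison is sketched below, under stated assumptions: the toy late_shipments rows are made up, and the column names prop and n are hypothetical. The idea is to count the rows, scale the hypothesized proportions by that total so both distributions are expressed as counts, and draw them as overlaid bars.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Toy stand-ins (made-up data; column names "prop" and "n" are assumptions).
late_shipments = pd.DataFrame({
    "vendor_inco_term": ["EXW"] * 14 + ["FCA"] * 3 + ["DDP"] * 2 + ["CIP"],
})
hypothesized = pd.DataFrame({
    "vendor_inco_term": ["CIP", "DDP", "EXW", "FCA"],
    "prop": [0.05, 0.1, 0.75, 0.1],
})

# Number of rows in the sample.
n_total = len(late_shipments)

# Expected count per incoterm under the hypothesized proportions.
hypothesized["n"] = hypothesized["prop"] * n_total

# Observed counts, aligned to the same incoterm order.
observed = (
    late_shipments["vendor_inco_term"]
    .value_counts()
    .reindex(hypothesized["vendor_inco_term"].tolist(), fill_value=0)
)

# Overlay observed and hypothesized counts on one bar chart.
fig, ax = plt.subplots()
ax.bar(observed.index, observed.values, color="red", label="Observed")
ax.bar(hypothesized["vendor_inco_term"], hypothesized["n"],
       color="blue", alpha=0.5, label="Hypothesized")
ax.set_xlabel("vendor_inco_term")
ax.set_ylabel("Number of shipments")
ax.legend()
# In an interactive session, call plt.show() to display the figure.
```

Converting the proportions to counts keeps both sets of bars on the same scale, so any gap between observed and hypothesized values is visible directly.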