Visualizing goodness of fit

The chi-square goodness of fit test compares proportions of each level of a categorical variable to hypothesized values. Before running such a test, it can be helpful to visually compare the distribution in the sample to the hypothesized distribution.

Recall the vendor incoterms in the late_shipments dataset. You hypothesize that the four values occur in the population of shipments with these relative frequencies:

  • CIP: 0.05
  • DDP: 0.1
  • EXW: 0.75
  • FCA: 0.1

These frequencies are stored in the hypothesized DataFrame.

The incoterm_counts DataFrame stores the .value_counts() of the vendor_inco_term column.
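
As a rough sketch of how these two DataFrames might be put together: the real late_shipments data is not reproduced here, so the block below uses a small made-up stand-in, and the column names prop and n are assumptions chosen for illustration rather than names confirmed by the exercise.

```python
import pandas as pd

# Toy stand-in for late_shipments (made-up rows, for illustration only)
late_shipments = pd.DataFrame(
    {"vendor_inco_term": ["EXW"] * 14 + ["FCA"] * 3 + ["DDP"] * 2 + ["CIP"] * 1}
)

# Hypothesized proportions from the list above.
# The column name "prop" is an assumption.
hypothesized = pd.DataFrame(
    {
        "vendor_inco_term": ["CIP", "DDP", "EXW", "FCA"],
        "prop": [0.05, 0.1, 0.75, 0.1],
    }
)

# Observed counts per incoterm, turned into a DataFrame and sorted by
# incoterm so the rows line up with hypothesized.
# The column name "n" is an assumption.
incoterm_counts = (
    late_shipments["vendor_inco_term"]
    .value_counts()
    .rename_axis("vendor_inco_term")
    .reset_index(name="n")
    .sort_values("vendor_inco_term")
    .reset_index(drop=True)
)

print(hypothesized)
print(incoterm_counts)
```

Sorting both tables by incoterm keeps the categories in the same order, which makes a side-by-side comparison easier to read.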

late_shipments is available; pandas and matplotlib.pyplot are loaded with their standard aliases.

This exercise is part of the course

Hypothesis Testing in Python

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Find the number of rows in late_shipments
n_total = ____

# Print n_total
print(n_total)
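
One possible way to carry the comparison through to a plot is sketched below. It is not necessarily the exercise's intended solution: it runs on a made-up stand-in for late_shipments, the column names prop and n are assumptions, and the overlaid bar chart is just one reasonable way to compare the observed and hypothesized counts.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Toy stand-in for late_shipments (made-up rows, for illustration only)
late_shipments = pd.DataFrame(
    {"vendor_inco_term": ["EXW"] * 14 + ["FCA"] * 3 + ["DDP"] * 2 + ["CIP"] * 1}
)

# Number of rows in the sample
n_total = len(late_shipments)

# Hypothesized proportions, scaled to counts so they are comparable
# with the observed counts ("prop" and "n" are assumed column names)
hypothesized = pd.DataFrame(
    {
        "vendor_inco_term": ["CIP", "DDP", "EXW", "FCA"],
        "prop": [0.05, 0.1, 0.75, 0.1],
    }
)
hypothesized["n"] = hypothesized["prop"] * n_total

# Observed counts per incoterm, sorted to match hypothesized
incoterm_counts = (
    late_shipments["vendor_inco_term"]
    .value_counts()
    .rename_axis("vendor_inco_term")
    .reset_index(name="n")
    .sort_values("vendor_inco_term")
    .reset_index(drop=True)
)

# Overlay the two sets of bars; transparency keeps both visible
fig, ax = plt.subplots()
ax.bar(incoterm_counts["vendor_inco_term"], incoterm_counts["n"],
       color="red", label="Observed")
ax.bar(hypothesized["vendor_inco_term"], hypothesized["n"],
       color="blue", alpha=0.5, label="Hypothesized")
ax.set_xlabel("vendor_inco_term")
ax.set_ylabel("Number of shipments")
ax.legend()
```

Scaling the hypothesized proportions by n_total puts both sets of bars on the same count scale, so any large gap between a red and a blue bar hints at a category where the sample departs from the hypothesis.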