Get startedGet started for free

Test of two proportions

You may wonder if the amount paid for freight affects whether or not the shipment was late. Recall that in the late_shipments dataset, whether or not the shipment was late is stored in the late column. Freight costs are stored in the freight_cost_group column, and the categories are "expensive" and "reasonable".

The hypotheses to test, with "late" corresponding to the proportion of late shipments for that group, are

\(H_{0}\): \(late_{\text{expensive}} - late_{\text{reasonable}} = 0\)

\(H_{A}\): \(late_{\text{expensive}} - late_{\text{reasonable}} > 0\)

p_hats contains the estimates of population proportions (sample proportions) for each freight_cost_group:

freight_cost_group  late
expensive           Yes     0.082569
reasonable          Yes     0.035165
Name: late, dtype: float64

ns contains the sample sizes for these groups:

freight_cost_group
expensive     545
reasonable    455
Name: late, dtype: int64

pandas and numpy have been imported under their usual aliases, and norm is available from scipy.stats.

This exercise is part of the course

Hypothesis Testing in Python

View Course

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Calculate the pooled estimate of the population proportion
p_hat = ____

# Print the result
print(p_hat)
Edit and Run Code