Exercise

Significance of Difference of Proportions

Bike commuting is still uncommon, but Washington, DC, has a relatively high share of bike commuters. That share has increased by over 1 percentage point in the last few years, but is the increase statistically significant? In this exercise you will calculate the standard error of a proportion, then a two-sample Z-statistic for the difference between two proportions.

The formula for the standard error (SE) of a proportion is:

$$SE_P = \frac{1}{N}\sqrt{SE_n^2 - P^2SE_N^2}$$

The formula for the two-sample Z-statistic is:

$$Z = \frac{x_1 - x_2}{\sqrt{SE_{x_1}^2 + SE_{x_2}^2}}$$
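
As a rough illustration of the two formulas, here is a minimal sketch using plain scalars. The function names `se_proportion` and `z_statistic` and all input numbers are made up for illustration and are not part of the exercise. The standard library's `math.sqrt` stands in for the numpy `sqrt` so the block runs on its own.

```python
from math import sqrt


def se_proportion(n, N, se_n, se_N):
    """SE of the proportion P = n / N (hypothetical helper).

    n, se_n: subpopulation estimate and its standard error
    N, se_N: population estimate and its standard error
    """
    p = n / N
    radicand = se_n**2 - p**2 * se_N**2
    if radicand < 0:
        # The formula has no real-valued result in this case.
        raise ValueError("value under the square root is negative")
    return sqrt(radicand) / N


def z_statistic(x1, x2, se_x1, se_x2):
    """Two-sample Z-statistic for the difference x1 - x2 (hypothetical helper)."""
    return (x1 - x2) / sqrt(se_x1**2 + se_x2**2)


# Made-up inputs, for illustration only.
se_p = se_proportion(n=50, N=1000, se_n=10, se_N=40)
z = z_statistic(x1=0.07, x2=0.05, se_x1=0.003, se_x2=0.004)
print(se_p, z)
```

The guard on the value under the square root is a design choice of this sketch: the subtraction in the SE formula can go negative when the proportion is large relative to the subpopulation's SE, and failing loudly seems safer than silently returning an invalid value.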

The data frame dc is loaded. It has columns (shown in the console) with estimates (ending "_est") and margins of error (ending "_moe") for total workers and bike commuters.

The sqrt function has been imported from the numpy module.

Instructions
  • Calculate bike_share by dividing the number of bikers by the total number of workers
  • Calculate se_bike and se_total, the SEs of the estimates of bikers and total workers, by dividing each MOE by Z_CRIT
  • Calculate se_p, the SE of the proportions: se_bike is the SE of the subpopulation \(SE_n\), bike_share is the proportion \(P\), and se_total is the SE of the population \(SE_N\)
  • Calculate \(Z\): \(x_1\) and \(x_2\) are the bike_share in 2017 and 2011; \(SE_{x_1}\) and \(SE_{x_2}\) are se_p in 2017 and 2011
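
The four steps above could be sketched as follows. This is only an illustration under stated assumptions: the column names (`total_est`, `total_moe`, `bike_est`, `bike_moe`), the use of the year as the index, and every number in the stand-in `dc` are hypothetical and are not the real Washington, DC, data. `Z_CRIT` is set to 1.645 on the assumption that the margins of error are reported at the 90% confidence level; use whatever value the exercise environment defines.

```python
import pandas as pd
from numpy import sqrt

# Assumption: MOEs are at the 90% confidence level, so Z_CRIT = 1.645.
Z_CRIT = 1.645

# Made-up stand-in for `dc`; NOT real Washington, DC, figures.
# Column names and the year index are hypothetical.
dc = pd.DataFrame(
    {
        "total_est": [300000.0, 370000.0],
        "total_moe": [5000.0, 6000.0],
        "bike_est": [9000.0, 18500.0],
        "bike_moe": [1500.0, 2000.0],
    },
    index=pd.Index([2011, 2017], name="year"),
)

# Step 1: share of workers who commute by bike, per year.
bike_share = dc["bike_est"] / dc["total_est"]

# Step 2: convert each margin of error to a standard error.
se_bike = dc["bike_moe"] / Z_CRIT
se_total = dc["total_moe"] / Z_CRIT

# Step 3: SE of the proportion, per year.
se_p = sqrt(se_bike**2 - bike_share**2 * se_total**2) / dc["total_est"]

# Step 4: two-sample Z-statistic comparing 2017 with 2011.
Z = (bike_share.loc[2017] - bike_share.loc[2011]) / sqrt(
    se_p.loc[2017] ** 2 + se_p.loc[2011] ** 2
)

# The difference is significant at the assumed level if |Z| exceeds Z_CRIT.
significant = abs(Z) > Z_CRIT
print(Z, significant)
```

Because the arithmetic is applied to whole columns, steps 1 to 3 compute both years at once. Only the final comparison needs to pick out individual years with `.loc`.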