Significance of Difference of Proportions

Bike commuting is still uncommon, but Washington, DC, has a decent share of bike commuters. That share increased by over 1 percentage point between 2011 and 2017, but is the increase statistically significant? In this exercise you will calculate the standard error of a proportion, then a two-sample Z-statistic for the difference between the two proportions.

The formula for the standard error (SE) of a proportion is:

$S E_{P} = \frac{1}{N} \sqrt{S E_{n}^{2} - P^{2} S E_{N}^{2}}$
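
As a quick check on the formula above, here is a minimal sketch in plain Python. The function name `se_proportion`, its parameter names, and the input numbers are all hypothetical and chosen only for illustration; they are not real survey figures.

```python
from math import sqrt


def se_proportion(n_est, se_n, total_est, se_total):
    """Standard error of the proportion P = n_est / total_est.

    n_est, se_n         -- estimate and SE of the subpopulation (n, SE_n)
    total_est, se_total -- estimate and SE of the population (N, SE_N)
    """
    p = n_est / total_est
    radicand = se_n**2 - p**2 * se_total**2
    if radicand < 0:
        # The square root is undefined here, so this formula cannot be used as-is.
        raise ValueError("negative value under the square root")
    return sqrt(radicand) / total_est


# Made-up inputs: 500 of 10,000 people, with SEs of 50 and 200.
se_p = se_proportion(n_est=500, se_n=50, total_est=10_000, se_total=200)
print(f"SE of proportion: {se_p:.6f}")
```

The guard on the radicand is a design choice for this sketch: when the subpopulation SE is small relative to the population SE, the term under the root can go negative, and failing loudly is safer than returning `nan`.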

The formula for the two-sample Z-statistic is:

$Z = \frac{x_{1} - x_{2}}{\sqrt{S E_{x_{1}}^{2} + S E_{x_{2}}^{2}}}$
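
The Z-statistic above can be sketched the same way. The helper name `two_sample_z` and the inputs are hypothetical. The value 1.645 is assumed here as the critical value for a 90% confidence level; the exercise itself only refers to it as Z_CRIT.

```python
from math import sqrt

Z_CRIT = 1.645  # assumption: 90% confidence level


def two_sample_z(x1, se_x1, x2, se_x2):
    """Z-statistic for the difference between two independent estimates."""
    return (x1 - x2) / sqrt(se_x1**2 + se_x2**2)


# Made-up proportions and standard errors, for illustration only.
z = two_sample_z(x1=0.05, se_x1=0.003, x2=0.04, se_x2=0.004)
significant = abs(z) > Z_CRIT
print(f"Z = {z:.2f}, significant at the assumed level: {significant}")
```

If the absolute value of Z exceeds the critical value, the difference between the two estimates is treated as statistically significant at that confidence level.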

The DataFrame dc is loaded. It has columns (shown in the console) with estimates (ending "_est") and margins of error (ending "_moe") for total workers and bike commuters.

The sqrt function has been imported from the numpy module.

Calculate bike_share by dividing the number of bike commuters by the total number of workers.
Calculate the SE of the bike commuter and total worker estimates by dividing each MOE by Z_CRIT.
Calculate se_p, the SE of the proportion: se_bike is the SE of the subpopulation $S E_{n}$, bike_share is the proportion $P$, se_total is the SE of the population $S E_{N}$, and the total worker estimate is $N$.
Calculate $Z$: $x_{1}$ and $x_{2}$ are bike_share in 2017 and 2011; $S E_{x_{1}}$ and $S E_{x_{2}}$ are se_p in 2017 and 2011.
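
The four steps can be sketched end to end with pandas. This is only an illustration under stated assumptions: the column names (`total_est`, `total_moe`, `bike_est`, `bike_moe`), the `year` index, and the value of `Z_CRIT` are guesses at how dc is laid out, and the numbers are made-up placeholders, not real estimates for Washington, DC.

```python
import pandas as pd
from numpy import sqrt

Z_CRIT = 1.645  # assumption: critical value for a 90% confidence level

# Hypothetical stand-in for the preloaded dc DataFrame (placeholder values).
dc = pd.DataFrame(
    {
        "year": [2011, 2017],
        "total_est": [300_000, 360_000],
        "total_moe": [5_000, 5_500],
        "bike_est": [9_000, 16_000],
        "bike_moe": [1_200, 1_600],
    }
).set_index("year")

# Step 1: share of workers who commute by bike.
dc["bike_share"] = dc["bike_est"] / dc["total_est"]

# Step 2: convert each margin of error to a standard error.
dc["se_bike"] = dc["bike_moe"] / Z_CRIT
dc["se_total"] = dc["total_moe"] / Z_CRIT

# Step 3: standard error of the proportion.
dc["se_p"] = (
    sqrt(dc["se_bike"] ** 2 - dc["bike_share"] ** 2 * dc["se_total"] ** 2)
    / dc["total_est"]
)

# Step 4: two-sample Z-statistic comparing 2017 with 2011.
x1, x2 = dc.loc[2017, "bike_share"], dc.loc[2011, "bike_share"]
se_x1, se_x2 = dc.loc[2017, "se_p"], dc.loc[2011, "se_p"]
Z = (x1 - x2) / sqrt(se_x1**2 + se_x2**2)

print(dc[["bike_share", "se_p"]])
print("Z =", Z, "| exceeds Z_CRIT:", abs(Z) > Z_CRIT)
```

Indexing by year keeps the final step readable; filtering rows with a boolean mask on a `year` column would work just as well.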
