1. Margins of Error
The Decennial Census is a complete count, but the ACS is a survey based on a sample. The numbers are *estimates*, and we may want to know the confidence intervals associated with them.
2. Margins of Error
The margin of error is available as a column ending in the letter "M", whereas the estimate columns end in "E". The Census reports 90% confidence intervals. This means there is only a 10% chance that the true total number of occupied housing units in Alabama is more than 11,416 above or below the estimate of 1.8 million.
3. Margins of Error
Requested MOE variables can be loaded into a pandas DataFrame.
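As a minimal sketch of what such a DataFrame looks like, here is a hand-built example using the Alabama figures from the narration. The variable codes follow the Census "E"/"M" naming convention, but are shown here for illustration rather than pulled from a live API request.

```python
import pandas as pd

# Illustrative ACS-style columns: "E" suffix = estimate, "M" suffix = 90% MOE.
# Values are the Alabama occupied-housing-units figures from the narration.
df = pd.DataFrame({
    "NAME": ["Alabama"],
    "B25044_001E": [1_800_000],  # estimated occupied housing units
    "B25044_001M": [11_416],     # margin of error for that estimate
})
print(df)
```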
4. Margins of Error
When we rename columns, we will use the same base name as the estimate column but add "_moe".
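A short sketch of that renaming step with pandas, assuming the illustrative variable codes above; the base name "total" is a readable stand-in for the estimate column, with "_moe" appended for its margin of error.

```python
import pandas as pd

df = pd.DataFrame({
    "B25044_001E": [1_800_000],
    "B25044_001M": [11_416],
})

# Rename: same base name for the estimate, plus "_moe" for the MOE column.
df = df.rename(columns={
    "B25044_001E": "total",
    "B25044_001M": "total_moe",
})
print(df.columns.tolist())
```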
5. Relative Margin of Error
We are often interested in understanding the Margin of Error as a percentage of the estimate, the Relative MOE.
The Relative MOE will be larger for smaller states...
as well as for counties and other smaller geographies.
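The Relative MOE calculation can be sketched as a simple column operation, again using the Alabama figures from the narration:

```python
import pandas as pd

df = pd.DataFrame({
    "state": ["Alabama"],
    "total": [1_800_000],
    "total_moe": [11_416],
})

# Relative MOE: the margin of error expressed as a percentage of the estimate.
df["total_relmoe"] = df["total_moe"] / df["total"] * 100
print(df)
```

For a large state like Alabama the Relative MOE is well under 1%; the same calculation applied to small states or counties yields much larger values.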
6. Margins of Error of Breakdown Columns
The Relative MOE will be largest for variables that represent small proportions of the table universe. For example, the proportion of owner-occupied housing units with no vehicle available *and* a young householder is very small.
Even a state as large as California has only an estimated 11 thousand households in this category, and a Relative MOE of almost 14%. In Wyoming, the MOE is larger than the estimate, yielding a Relative MOE of greater than 100%. This is not uncommon for small breakdowns,
and the problem can be even worse in smaller geographies.
7. Standard Errors
The MOEs are based on Standard Errors. The Standard Errors are not released and cannot be computed directly without the household-level data, but they can be reverse engineered from the MOE. The MOE is based on a Z-statistic taken from the normal distribution.
A critical value of 1.645 indicates a 90% confidence interval, which is the confidence interval reported by the Census for all ACS data releases. The Standard Error can be calculated by dividing the MOE by this critical value.
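That division is a one-liner; here it is applied to the Alabama MOE from earlier in the lesson:

```python
# The Census reports 90% MOEs, so the critical value is 1.645.
# Dividing the MOE by it recovers the Standard Error.
Z_CRIT = 1.645
moe = 11_416  # Alabama occupied-housing-units MOE from earlier slides
se = moe / Z_CRIT
print(round(se))  # ≈ 6940
```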
8. Statistically Significant Difference
You can determine whether the year-over-year change in an estimate is statistically significant by calculating the Z-statistic and comparing it to the critical value.
For example, let's look at values from Table B25045 for California. Was there a significant change in the estimate of occupied housing units from 2016 to 2017?
Begin by defining the variables we need for our formula. x1 and x2 are the estimates, taken from the "total" column, for 2017 and 2016 respectively. The `int` function converts from a Series object to an atomic value.
The standard errors are taken from the "total_moe" column, divided by the critical Z value. The `float` function converts from a Series object to an atomic value.
9. Statistically Significant Difference
Then apply the formula. Z is the difference between the estimates, x1 - x2,
10. Statistically Significant Difference
divided by the square root, taken with numpy.sqrt,
11. Statistically Significant Difference
of the sum of squared standard errors, calculated as se_x1 squared plus se_x2 squared.
Finally, we print the comparison of the absolute value of Z to the critical Z value.
The result is True, indicating a statistically significant change.
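The steps above can be sketched end to end. The California totals from table B25045 are not reproduced in this lesson, so the estimates and MOEs below are illustrative stand-ins; only the formula itself is the point.

```python
import numpy as np

Z_CRIT = 1.645  # critical value for a 90% confidence interval

# Illustrative values standing in for the California B25045 totals.
x1, x2 = 13_000_000, 12_900_000          # 2017 and 2016 estimates
se_x1 = 30_000 / Z_CRIT                  # SE = MOE / critical value
se_x2 = 31_000 / Z_CRIT

# Z-statistic for the difference between two independent estimates.
z = (x1 - x2) / np.sqrt(se_x1**2 + se_x2**2)

# A change is statistically significant when |Z| exceeds the critical value.
print(abs(z) > Z_CRIT)
```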
12. Approximating SE for Derived Estimates
MOEs can be approximated for derived estimates, like the sum of two or more columns.
Let's estimate 65 and older householders without access to a vehicle. Add the owner-occupied and renter columns.
The MOE of this estimate is calculated as the critical Z value times the square root of the sum of the squared standard errors, which works out to the square root of the sum of the squared MOEs.
13. Approximating SE for Derived Estimates
This is the DataFrame with the new columns.
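A minimal sketch of the derived-estimate calculation, assuming hypothetical column names and values for the owner- and renter-occupied counts of 65-and-older householders with no vehicle:

```python
import numpy as np
import pandas as pd

# Hypothetical columns and values for illustration.
df = pd.DataFrame({
    "owner_65plus_novehicle": [40_000],
    "owner_65plus_novehicle_moe": [2_000],
    "renter_65plus_novehicle": [60_000],
    "renter_65plus_novehicle_moe": [3_000],
})

# The derived estimate is the sum of the two component columns...
df["total_65plus_novehicle"] = (
    df["owner_65plus_novehicle"] + df["renter_65plus_novehicle"]
)
# ...and its approximate MOE is the square root of the sum of squared MOEs.
df["total_65plus_novehicle_moe"] = np.sqrt(
    df["owner_65plus_novehicle_moe"] ** 2
    + df["renter_65plus_novehicle_moe"] ** 2
)
print(df[["total_65plus_novehicle", "total_65plus_novehicle_moe"]])
```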
14. Let's Practice!
Understanding Margins of Error is very important for analyzing ACS data. Let's get some practice with this concept.