1. Working with margins of error in tidycensus
In contrast with data from the decennial Census, estimates from the American Community Survey are based on an annual sample and subject to a margin of error. In the remainder of this chapter, we'll be going over how to work with and visualize margins of error in the ACS.
2. ACS data vs. census data
As the Census Bureau explains, "While the main function of the decennial census is to provide counts of people for congressional apportionment and legislative redistricting, the primary purpose of the ACS is to measure the changing social and economic characteristics of the U.S. population. As a result, the ACS does not provide official counts of the population in between censuses."
Understanding this distinction is important. While the ACS is very useful for helping analysts understand relative demographic trends, it shouldn't be used to determine precise counts. This matters all the more because count data in the ACS aren't exact figures at all: each estimate represents a range of values within which the true value is expected to lie.
3. Margins of error in the ACS
A standard get_acs() function call in tidycensus - like the one shown for median age for counties in Oregon - returns an estimate with an associated margin of error for each row. By default, the value in the moe column is the margin of error at the 90 percent confidence level. Adding it to and subtracting it from the estimate gives a range: for Baker County, we can be 90 percent confident that the true median age is between 47.8 and 48.6. This still allows us to make comparisons; we can be confident that rural Baker County is much older than Benton County, which is home to Oregon State University. However, we must take care not to treat the estimates as precise values, as that is not what they represent.
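To make the interval arithmetic concrete, here is a minimal sketch in base R. The estimate (48.2) and margin of error (0.4) are inferred from the Baker County bounds quoted above rather than pulled from the API, and moe_bounds() is a hypothetical helper, not a tidycensus function. The commented-out call shows one way the data might be requested; the variable code B01002_001 (median age) is an assumption about what the slide used.

```r
# Hypothetical helper: turn an estimate and its margin of error into bounds
moe_bounds <- function(estimate, moe) {
  data.frame(
    estimate = estimate,
    moe      = moe,
    lower    = estimate - moe,
    upper    = estimate + moe
  )
}

# Values inferred from the bounds quoted for Baker County (47.8 to 48.6)
baker <- moe_bounds(estimate = 48.2, moe = 0.4)
print(baker)

# Rescale a 90 percent MOE to 95 percent, assuming a normal approximation
moe_95 <- baker$moe * qnorm(0.975) / qnorm(0.95)
print(moe_95)

# In a live session (Census API key required), the rows could come from:
# library(tidycensus)
# or_age <- get_acs(geography = "county", variables = "B01002_001", state = "OR")
```

The rescaling step is only an approximation; get_acs() also accepts a moe_level argument if a different confidence level is wanted directly from the request.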
4. Inspecting margins of error
In the example of county median age in Oregon, the margins of error did not significantly affect the comparisons we made. However, when working with sparser data at smaller geographies, margins of error can be a much greater issue for analysts. In the next example, we retrieve data on the number of males and females age 75 and up who are living below the poverty line, by Census tract in Vermont. As we can see for Addison County, the margin of error sometimes exceeds the estimate! As such, we may want to combine estimates to reduce the margins of error relative to the values they describe.
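One quick way to spot this problem is to compare each margin of error with its estimate. The sketch below uses a small hand-built data frame with made-up values standing in for tract-level get_acs() output; the tract IDs, the variable labels (eldpovm, eldpovf), and the numbers are all hypothetical.

```r
# Hypothetical tract-level rows (illustrative values, not real ACS output)
vt_example <- data.frame(
  GEOID    = c("tract_a", "tract_a", "tract_b", "tract_b"),
  variable = c("eldpovm", "eldpovf", "eldpovm", "eldpovf"),
  estimate = c(12, 25, 40, 8),
  moe      = c(10, 17, 21, 11)
)

# Size of the margin of error relative to the estimate it belongs to
vt_example$moe_share <- vt_example$moe / vt_example$estimate

# Rows where the margin of error is larger than the estimate itself
unreliable <- vt_example[vt_example$moe > vt_example$estimate, ]
print(unreliable)
```

Rows flagged this way are candidates for aggregation, since the interval around them includes zero.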
5. Using margin of error functions
tidycensus includes a series of built-in functions for calculating derived margins of error, using the suggested formulas from the US Census Bureau. These include moe_sum(), for a derived sum; moe_product(), for a derived product; moe_ratio(), for a derived ratio; and moe_prop(), for a derived proportion.
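For reference, the Census Bureau approximations behind these four functions can be written out as follows. This is a summary of the published guidance as I understand it, with A and B standing for two estimates and MOE_A and MOE_B for their margins of error; the symbols are my own notation, not taken from the slides.

```latex
\begin{align*}
\text{Sum } (A + B):\quad
  & \mathrm{MOE}_{A+B} = \sqrt{\mathrm{MOE}_A^2 + \mathrm{MOE}_B^2} \\
\text{Product } (A \times B):\quad
  & \mathrm{MOE}_{AB} = \sqrt{A^2\,\mathrm{MOE}_B^2 + B^2\,\mathrm{MOE}_A^2} \\
\text{Ratio } (R = A/B):\quad
  & \mathrm{MOE}_R = \frac{1}{B}\sqrt{\mathrm{MOE}_A^2 + R^2\,\mathrm{MOE}_B^2} \\
\text{Proportion } (P = A/B,\ A \text{ a subset of } B):\quad
  & \mathrm{MOE}_P = \frac{1}{B}\sqrt{\mathrm{MOE}_A^2 - P^2\,\mathrm{MOE}_B^2}
\end{align*}
```

The sum formula extends to any number of terms by adding more squared margins under the root. If the value under the root in the proportion formula comes out negative, the usual guidance is to fall back on the ratio formula.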
6. Group-wise margins of error
These margin of error functions can be used independently or in tidyverse operations. In the example shown, we group the data by Census tract, sum the estimates for males and females age 75 and up in poverty, and use the moe_sum() function to generate a derived margin of error for each combined estimate. While the margins of error are still large in this instance, they are smaller relative to their estimates.
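A group-wise calculation along these lines might look like the sketch below. It assumes the dplyr and tidycensus packages are installed and uses a small made-up table in place of real get_acs() output, so the tract IDs, variable labels, and values are hypothetical. Note that the derived margin of error is computed before estimate is overwritten with its sum.

```r
library(dplyr)
library(tidycensus)

# Hypothetical tract-level rows (illustrative values, not real ACS output)
vt_example <- tribble(
  ~GEOID,    ~variable, ~estimate, ~moe,
  "tract_a", "eldpovm",        12,   10,
  "tract_a", "eldpovf",        25,   17,
  "tract_b", "eldpovm",        40,   21,
  "tract_b", "eldpovf",         8,   11
)

vt_combined <- vt_example %>%
  group_by(GEOID) %>%
  summarize(
    # Derive the MOE first, while `estimate` still holds the per-row values
    estmoe   = moe_sum(moe, estimate = estimate),
    estimate = sum(estimate)
  ) %>%
  # Size of the derived MOE relative to the combined estimate
  mutate(moe_share = estmoe / estimate)

print(vt_combined)
```

With these made-up numbers, each combined margin of error is larger in absolute terms than either input, yet smaller as a share of the combined estimate than the inputs were of theirs.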
7. Let's practice!
Let's get some practice working with ACS margins of error in tidycensus.