Customizing tigris options

1. Customizing tigris options

Functions in the tigris package include a variety of options for customizing and facilitating R users' work with Census geographic datasets. In this lesson, you'll learn about some of these options.

2. Cartographic boundary shapefiles

tigris functions return datasets from the TIGER/Line database by default. However, TIGER/Line files are not suited to every task. TIGER/Line shapefiles include water area, which can make coastal areas look different than expected when mapped. As an alternative, the Census Bureau publishes cartographic boundary shapefiles, which are clipped to the US shoreline and generalized for mapping. For many boundary functions in tigris, these cartographic boundary files are available by setting the argument cb = TRUE.
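As a minimal sketch of the cb argument, the calls below request the same geography twice, once from TIGER/Line and once as a cartographic boundary file. The choice of Delaware Census tracts and the object names are assumptions made for illustration, not part of the lesson; the code also needs an internet connection to download the files.

```r
library(tigris)

# Ask for sf objects so both results share one class
# (an assumption for this sketch; see the later section on simple features)
options(tigris_class = "sf")

# Default behaviour: TIGER/Line Census tracts for Delaware
de_tiger <- tracts(state = "DE")

# Same geography from the cartographic boundary files
de_cb <- tracts(state = "DE", cb = TRUE)

# Inspect how many features each version contains
nrow(de_tiger)
nrow(de_cb)
```

Apart from cb = TRUE, the two calls are identical, so switching an existing workflow to cartographic boundary files is usually a one-argument change.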

3. TIGER/Line vs. cartographic boundary files

The example shown on the slide illustrates the differences between TIGER/Line and cartographic boundary datasets for the US state of Rhode Island, which includes islands and peninsular areas. We're using the plot() function in base R, which can be used to visualize the geometry of features from tigris quickly. The TIGER/Line dataset, where the cb parameter is set to FALSE, includes Rhode Island's water area in the dataset. The cartographic boundary dataset, in contrast, illustrates the detail along Rhode Island's coastline.
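The comparison described here could be reproduced roughly as follows. This is a sketch, not the slide's exact code: it assumes Census tracts as the geography, requests sf objects, and uses sf::st_geometry() so that plot() draws only the outlines. The object names are hypothetical.

```r
library(tigris)
library(sf)

options(tigris_class = "sf")

# TIGER/Line version (cb = FALSE is the default) and cartographic boundary version
ri_tiger <- tracts(state = "RI")
ri_cb <- tracts(state = "RI", cb = TRUE)

# Draw the two geometries side by side with base R graphics
old_par <- par(mfrow = c(1, 2))

plot(st_geometry(ri_tiger))
title(main = "TIGER/Line")

plot(st_geometry(ri_cb))
title(main = "Cartographic boundary")

# Restore the previous plotting layout
par(old_par)
```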

4. tigris and simple features

tigris users also have the option to return Census geographic datasets as simple features objects, courtesy of the sf package. This can be set globally in R with options(tigris_class = "sf"). The sf package is the next-generation package for vector-based spatial data in R and represents these geographic datasets much like R data frames. The Arizona counties dataset is shown on the slide as an sf object. In addition to the counties dataset's attributes, the geometry of the counties is stored in a list-column. A major advantage of using sf classes, instead of the default sp, is that sf can speed up load times significantly for large datasets.
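A short sketch of this workflow is shown below. The object name az_counties is an assumption for illustration, and the download requires an internet connection.

```r
library(tigris)
library(sf)

# Return simple features objects from all subsequent tigris calls
options(tigris_class = "sf")

# Counties for Arizona
az_counties <- counties(state = "AZ")

# The result is an sf object, which is also a data frame
class(az_counties)

# Attribute columns sit alongside a geometry list-column
head(az_counties)

# Extract the geometry column on its own
st_geometry(az_counties)
```

Because the object behaves like a data frame, ordinary data frame operations such as subsetting rows or selecting columns apply to it directly.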

5. Caching tigris data

While using sf classes can speed up the loading of Census shapefiles into R, it will not speed up download times, which can be lengthy for large datasets and slower internet connections. To address this, tigris offers the option of caching shapefiles on the user's computer, set with options(tigris_use_cache = TRUE). Once a file has been downloaded, tigris will automatically look for it in the user's cache directory and avoid downloading it again. tigris uses a default cache directory on the user's computer, but this location can be customized with the tigris_cache_dir() function.
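The caching behaviour can be sketched as below. The cache path is hypothetical and the tigris_cache_dir() call is left commented out because, as far as I know, it writes the setting to the user's .Renviron file and only takes effect after R is restarted. Timing the two downloads is an illustration added here, not something the lesson does.

```r
library(tigris)

# Turn on caching for this R session
options(tigris_use_cache = TRUE)

# Optionally choose a custom cache location (hypothetical path);
# uncomment to apply, then restart R
# tigris_cache_dir("~/tigris_cache")

# First call downloads the file and stores it in the cache
first_call <- system.time(ri_tracts <- tracts(state = "RI"))

# Second call should find the cached file instead of downloading again
second_call <- system.time(ri_tracts_again <- tracts(state = "RI"))

first_call["elapsed"]
second_call["elapsed"]
```

Setting the option at the top of a script, or in an .Rprofile file, means every later tigris call in that session benefits from the cache.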

6. Historical data and tigris

To ensure integration with tidycensus, which you'll learn about in the final chapter of this course, tigris defaults to the year of the most recently released American Community Survey data. However, Census shapefiles are available in tigris for 1990, 2000, 2010, and 2011 through 2017 by specifying a "year" argument. Boundaries of statistical entities can change quite a bit over time, which we'll explore here for Census tracts in fast-growing Williamson County, Texas, just north of Austin.

7. Historical data and tigris

As we can see in the plot, there were fewer Census tracts in Williamson County in 1990 than in 2016. The Census Bureau re-draws the boundaries of statistical entities with every decennial Census, especially in fast-growing areas. Be aware of this when analyzing data over time!
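The 1990 and 2016 comparison could be sketched along these lines. The object names and the stacked plot layout are assumptions; cb = TRUE is used for both years on the understanding that tigris serves 1990 tracts only as cartographic boundary files.

```r
library(tigris)
library(sf)

options(tigris_class = "sf")

# Census tracts for Williamson County, Texas, in two different years
williamson90 <- tracts(state = "TX", county = "Williamson",
                       cb = TRUE, year = 1990)
williamson16 <- tracts(state = "TX", county = "Williamson",
                       cb = TRUE, year = 2016)

# Compare the number of tracts in each year
nrow(williamson90)
nrow(williamson16)

# Plot the two sets of boundaries one above the other
old_par <- par(mfrow = c(2, 1))

plot(st_geometry(williamson90))
title(main = "1990")

plot(st_geometry(williamson16))
title(main = "2016")

par(old_par)
```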

8. Let's practice!

Now it's your turn! Let's try working with these options in R.
