Get startedGet started for free

Checking data will match

Forcing your data into the data slot doesn't work because you lose the correct correspondence between rows and spatial objects. How do you add the income data to the polygon data? The merge() function in sp is designed exactly for this purpose.

You might have seen merge() before with data frames. sp::merge() has almost the exact same structure, but you pass it a Spatial*** object and a data frame and it returns a new Spatial*** object where the data slot is now a merge of the original data slot and the data frame. To do this merge, you'll require both the spatial object and data frame to have a column that contains IDs to match on.

Both nyc_tracts and nyc_income have columns with tract IDs, so these are great candidates for merging the two datasets. However, it's always a good idea to check that the proposed IDs are unique and that there is a match for every row in both datasets.

Let's check this before moving on to the merge.

This exercise is part of the course

Visualizing Geospatial Data in R

View Course

Exercise instructions

  • Use any() with duplicated() on nyc_income$tract to check if every row in nyc_income has a unique tract ID.
  • Use any() with duplicated() on nyc_tracts$TRACTCE to check if every row in nyc_tracts has a unique tract ID.
  • Use all() on nyc_tracts$TRACTCE %in% nyc_income$tract to check the nyc_tracts tracts are all in nyc_income.
  • Use all() on nyc_income$tract %in% nyc_tracts$TRACTCE to check the nyc_income tracts are all in nyc_tracts.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Check for duplicates in nyc_income


# Check for duplicates in nyc_tracts


# Check nyc_tracts in nyc_income


# Check nyc_income in nyc_tracts
Edit and Run Code