Putting it together
In this chapter, you've cleaned up the city column of zagat using string similarity, and you've generated and compared pairs of restaurants from zagat and fodors. The end is near: all that's left to do is score and select pairs and link the data together, and you'll be able to begin your analysis in no time!
reclin and dplyr are loaded, and zagat and fodors are available.
This exercise is part of the course Cleaning Data in R.
Have a go at this exercise by completing this sample code.
# Create pairs
pair_blocking(zagat, fodors, blocking_var = "city") %>%
  # Compare pairs
  compare_pairs(by = c("name", "addr"), default_comparator = jaro_winkler()) %>%
  # Score pairs
  ___
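The remaining steps (score, select, link) can be sketched on a small stand-in dataset. This is a minimal illustration, not the exercise's official solution: toy_a and toy_b are hypothetical data frames invented here in place of zagat and fodors, score_simsum() is used as a simple scoring choice (reclin also provides score_problink() for probabilistic scoring), and large = FALSE is an assumption made so that the pairs are kept as an ordinary data frame.

```r
library(reclin)
library(dplyr)

# Hypothetical toy data standing in for zagat and fodors
toy_a <- data.frame(
  name = c("cafe luna", "blue door diner"),
  addr = c("12 main st", "400 oak ave"),
  city = c("boston", "chicago"),
  stringsAsFactors = FALSE
)
toy_b <- data.frame(
  name = c("caffe luna", "blue door dinner"),
  addr = c("12 main street", "400 oak avenue"),
  city = c("boston", "chicago"),
  stringsAsFactors = FALSE
)

linked <- pair_blocking(toy_a, toy_b, blocking_var = "city", large = FALSE) %>%
  # Compare name and address with Jaro-Winkler similarity
  compare_pairs(by = c("name", "addr"), default_comparator = jaro_winkler()) %>%
  # Score each pair by summing its similarity scores
  score_simsum() %>%
  # Keep the best-scoring pairs, using each record at most once
  select_n_to_m() %>%
  # Join the selected pairs back to the original records
  link()

print(linked)
```

Because the pairs are blocked on city, each toy record is only ever compared with the record from the same city, so the linked result should pair the two Boston rows and the two Chicago rows. Columns shared by both inputs come back with .x and .y suffixes.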