Pairs of restaurants
In the last lesson, you cleaned the restaurants dataset to make it ready for building a restaurants recommendation engine. You have a new DataFrame named restaurants_new with new restaurants to train your model on, that's been scraped from a new data source.
You've already cleaned the cuisine_type and city columns using the techniques learned throughout the course. However you saw duplicates with typos in restaurants names that require record linkage instead of joins with restaurants.
In this exercise, you will perform the first step in record linkage and generate possible pairs of rows between restaurants and restaurants_new. Both DataFrames, pandas and recordlinkage are in your environment.
This exercise is part of the course
Cleaning Data in Python
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create an indexer and object and find possible pairs
indexer = ____
# Block pairing on cuisine_type
indexer.____(____)
# Generate pairs
pairs = indexer.____(____, ____)