Aan de slagGa gratis aan de slag

Pairs of restaurants

In the last lesson, you cleaned the restaurants dataset to make it ready for building a restaurants recommendation engine. You have a new DataFrame named restaurants_new with new restaurants to train your model on, that's been scraped from a new data source.

You've already cleaned the cuisine_type and city columns using the techniques learned throughout the course. However you saw duplicates with typos in restaurants names that require record linkage instead of joins with restaurants.

In this exercise, you will perform the first step in record linkage and generate possible pairs of rows between restaurants and restaurants_new. Both DataFrames, pandas and recordlinkage are in your environment.

Deze oefening maakt deel uit van de cursus

Cleaning Data in Python

Cursus bekijken

Praktische interactieve oefening

Probeer deze oefening eens door deze voorbeeldcode in te vullen.

# Create an indexer and object and find possible pairs
indexer = ____

# Block pairing on cuisine_type
indexer.____(____)

# Generate pairs
pairs = indexer.____(____, ____)
Code bewerken en uitvoeren