IniziaInizia gratis

Fixing typos with string distance

In this chapter, one of the datasets you'll be working with, zagat, is a set of restaurants in New York, Los Angeles, Atlanta, San Francisco, and Las Vegas. The data is from Zagat, a company that collects restaurant reviews, and includes the restaurant names, addresses, phone numbers, as well as other restaurant information.

The city column contains the name of the city that the restaurant is located in. However, there are a number of typos throughout the column. Your task is to map each city to one of the five correctly-spelled cities contained in the cities data frame.

dplyr and fuzzyjoin are loaded, and zagat and cities are available.

Questo esercizio fa parte del corso

Cleaning Data in R

Visualizza il corso

Esercizio pratico interattivo

Prova a risolvere questo esercizio completando il codice di esempio.

# Count the number of each city variation
zagat %>%
  count(___)
Modifica ed esegui il codice