Exercise

Trying out different methods

You have already learned about several methods of calculating string distances. Which method works best depends on the data and on the kinds of typos you expect, so it's a good idea to experiment with the different methods and their parameters to get to know them better. For this exercise you'll use the search term "Marya Carey", a mistyped version of the name "Mariah Carey". How similar is the mistyped name to the real one under different string distance methods?

The goal is to find parameters that yield a low distance between the search term and the correct name while maintaining a large distance to the other names in the list, which do not belong to the person being searched for.

Instructions
  • Generate the q-grams for substring length values of 1 and 2.
  • Calculate the string distance between search and names using the q-gram method for substring length values of 1 and 2.
  • Calculate the string distance between search and names by using the "osa" method.
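To make the three steps above more concrete, here is a minimal, hand-rolled sketch in Python. It is not the library call the exercise expects you to use. The function names qgrams, qgram_distance and osa_distance are hypothetical helpers written for illustration. The sketch assumes that the q-gram distance is the sum of absolute differences between the q-gram counts of the two strings, and that "osa" (optimal string alignment) counts insertions, deletions, substitutions and swaps of adjacent characters.

```python
from collections import Counter


def qgrams(text, q):
    """Count every overlapping substring of length q in text (hypothetical helper)."""
    return Counter(text[i:i + q] for i in range(len(text) - q + 1))


def qgram_distance(a, b, q):
    """Sum of absolute count differences between the two q-gram profiles."""
    profile_a, profile_b = qgrams(a, q), qgrams(b, q)
    # A Counter returns 0 for q-grams it has never seen.
    return sum(abs(profile_a[g] - profile_b[g]) for g in set(profile_a) | set(profile_b))


def osa_distance(a, b):
    """Optimal string alignment distance via dynamic programming."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Swap of two adjacent characters counts as a single edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


if __name__ == "__main__":
    search = "Marya Carey"
    name = "Mariah Carey"
    for q in (1, 2):
        print(f"q = {q}:", dict(qgrams(search, q)))
        print(f"q-gram distance (q = {q}):", qgram_distance(search, name, q))
    print("osa distance:", osa_distance(search, name))
```

Running the same functions over every entry in the list of names, instead of a single name, shows which settings keep the intended person close to the search term while pushing the other entries further away.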