
Edit distances vs. q-gram methods

Different string distance methods can return very different results for the same pair of strings: one method might return a value below 1, another a value above 10. That's why it's useful to know the inner workings of each method.

You have seen six methods. Three of them use an "edit distance" approach: they count the number of edits needed to convert the first string into the second.
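
To make the edit-counting idea concrete, here is a minimal Python sketch of one common edit distance, the Levenshtein distance, which counts single-character insertions, deletions, and substitutions. The function name `levenshtein` is an assumption chosen for illustration; this is not the course's R code, and other edit distances allow a different set of edits.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    # prev[j] is the distance between the already processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (free if equal)
            ))
        prev = curr
    return prev[-1]


print(levenshtein("kitten", "sitting"))  # prints 3
```

The result is always a whole number of edits, so it grows with the length of the strings being compared.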

The other three work differently: they divide each string into substrings of a fixed length, so-called q-grams (sometimes also referred to as n-grams). Do you remember which of the methods these were?
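
As a rough illustration of the q-gram idea, the Python sketch below splits strings into overlapping substrings of length q and compares the resulting q-gram profiles in two ways that are commonly used: by count differences and by set overlap. The helper names `qgrams`, `qgram_distance`, and `jaccard_distance` are assumptions made for this example and are not taken from the course or from any R package.

```python
from collections import Counter


def qgrams(s: str, q: int = 2) -> Counter:
    """Count every overlapping substring of length q in s."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def qgram_distance(a: str, b: str, q: int = 2) -> int:
    """Sum of absolute differences between the q-gram counts of a and b."""
    ca, cb = qgrams(a, q), qgrams(b, q)
    # A Counter returns 0 for q-grams it has not seen.
    return sum(abs(ca[g] - cb[g]) for g in set(ca) | set(cb))


def jaccard_distance(a: str, b: str, q: int = 2) -> float:
    """1 minus the share of distinct q-grams that a and b have in common."""
    sa, sb = set(qgrams(a, q)), set(qgrams(b, q))
    if not sa and not sb:
        return 0.0  # two strings without any q-grams are treated as identical
    return 1 - len(sa & sb) / len(sa | sb)


print(sorted(qgrams("banana").items()))
# prints [('an', 2), ('ba', 1), ('na', 2)]
print(qgram_distance("kitten", "sitting"))              # prints 7
print(round(jaccard_distance("kitten", "sitting"), 3))  # prints 0.778
```

For the same pair of strings, the count-based measure is an unbounded whole number while the overlap-based measure stays between 0 and 1, which is one reason results from different methods cannot be compared directly.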

This exercise is part of the course

Intermediate Regular Expressions in R
