Exercise

PhraseMatcher in spaCy

When processing unstructured text, you often have long lists or dictionaries of terms that you want to find in a given text. Matcher patterns are handcrafted, and each token must be specified individually, so Matcher is not the best option for a long list of phrases. In that case, the PhraseMatcher class lets you match against large lists of terms. In this exercise, you will practice using the PhraseMatcher class to find spans of text whose token shapes match the shapes of multiple given terms.
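To make the idea of a token "shape" concrete, here is a minimal pure-Python sketch that approximates the shape string spaCy assigns to a token: uppercase letters become X, lowercase letters become x, digits become d, and runs of the same shape character are capped at four. The helper name token_shape is hypothetical and not part of spaCy, and the rules are a simplification of the library's behavior.

```python
def token_shape(text: str) -> str:
    """Approximate a spaCy-style shape string for a single token.

    Hypothetical helper for illustration only; spaCy computes shapes
    internally and exposes them as Token.shape_.
    """
    shape = []
    last = ""  # previous shape character
    run = 0    # length of the current run of identical shape characters
    for ch in text:
        if ch.isalpha():
            mapped = "X" if ch.isupper() else "x"
        elif ch.isdigit():
            mapped = "d"
        else:
            mapped = ch  # punctuation and symbols are kept as-is
        if mapped == last:
            run += 1
        else:
            run = 1
            last = mapped
        if run <= 4:  # cap runs of the same shape character at four
            shape.append(mapped)
    return "".join(shape)


# Two different IP-like strings share one shape, so a single shape-based
# pattern can cover both.
for term in ["127.0.0.1", "110.0.0.0", "192.168.1.1", "Apple"]:
    print(term, "->", token_shape(term))
```

Because terms with different characters can share a shape, a handful of example terms can stand in for a whole family of similar strings.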

The en_core_web_sm model is already loaded and available as nlp. The PhraseMatcher class has been imported. A text string and a list of terms are available for your use.

Instructions

  • Initialize a PhraseMatcher object with an attr argument that matches on the shape of the given terms.
  • Create the patterns to add to the PhraseMatcher object.
  • Find matches for the given patterns and print the start and end token indices and the matching section of the given text.
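The three steps above can be sketched as follows. This is one possible approach, not the exercise's reference solution. It assumes spaCy v3 and, to stay self-contained, builds a blank English pipeline instead of the preloaded en_core_web_sm model. The text, the terms list, and the "IP" pattern label are made-up stand-ins for the variables the exercise provides.

```python
import spacy
from spacy.matcher import PhraseMatcher

# Assumption: a blank English pipeline is enough here, because token shape
# is a lexical attribute that does not need a trained model.
nlp = spacy.blank("en")

# Hypothetical stand-ins for the exercise's text string and list of terms.
text = "Often the router will have an IP address such as 192.168.1.1 or 192.168.2.1."
terms = ["127.0.0.1", "127.127.0.0"]

# Step 1: match on token shape rather than on the exact text.
matcher = PhraseMatcher(nlp.vocab, attr="SHAPE")

# Step 2: each pattern is a Doc created from one term.
patterns = [nlp(term) for term in terms]
matcher.add("IP", patterns)

# Step 3: run the matcher and report token indices and the matched span.
doc = nlp(text)
matched = []
for match_id, start, end in matcher(doc):
    span = doc[start:end]
    matched.append(span.text)
    print("Start token:", start, "| End token:", end, "| Matched text:", span.text)
```

Neither address in the text appears in terms, yet both are found, because each has the same shape as one of the example terms.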