Debugging patterns (2)
Both patterns in this exercise contain mistakes and won't match as expected. Can you fix them?
The nlp object and a doc have already been created for you. If you get stuck, try printing the tokens in the doc to see how the text will be split, and adjust the pattern so that each dictionary represents one token.
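As a minimal sketch of the hint above, the snippet below prints the tokens of a short text. It makes two assumptions: a blank English pipeline stands in for the preloaded nlp object, and the sample sentence is hypothetical, since the exercise text is not shown here.

```python
import spacy

# Assumption: a blank English pipeline stands in for the preloaded `nlp`.
# It only tokenizes; it does not assign part-of-speech tags.
nlp = spacy.blank("en")

# Hypothetical sample text (the exercise's own text is not shown here).
doc = nlp("Try the ad-free viewing with Amazon Prime")

# Collect and print the token texts to see how the text was split.
tokens = [token.text for token in doc]
print(tokens)
```

With the English tokenizer rules, a hyphenated word such as "ad-free" is split into three tokens ("ad", "-", "free"), so a pattern that describes it needs one dictionary per token.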
This exercise is part of the course Advanced NLP with spaCy.
Exercise instructions
- Edit pattern1 so that it correctly matches all case-insensitive mentions of "Amazon" plus a title-cased proper noun.
- Edit pattern2 so that it correctly matches all case-insensitive mentions of "ad-free", plus the following noun.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create the match patterns
pattern1 = [{'LOWER': 'Amazon'}, {'IS_TITLE': True, 'POS': 'PROPN'}]
pattern2 = [{'LOWER': 'ad-free'}, {'POS': 'NOUN'}]
# Initialize the Matcher and add the patterns
matcher = Matcher(nlp.vocab)
matcher.add('PATTERN1', None, pattern1)
matcher.add('PATTERN2', None, pattern2)
# Iterate over the matches
for match_id, start, end in matcher(doc):
    # Print pattern string name and text of matched span
    print(doc.vocab.strings[match_id], doc[start:end].text)
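One possible way to repair the two patterns is sketched below. It is not necessarily the only fix. It assumes spaCy v3, where Matcher.add takes a list of patterns rather than the None callback argument used in the sample code. Because no trained pipeline is loaded, the sketch builds a hypothetical doc by hand, and the part-of-speech tags on it are assumed values rather than model output.

```python
import spacy
from spacy.matcher import Matcher
from spacy.tokens import Doc

# Assumption: a blank pipeline supplies the vocab; no tagger is available.
nlp = spacy.blank("en")

# Hypothetical doc with hand-assigned POS tags (assumed, not predicted).
words = ["Get", "ad", "-", "free", "viewing", "with", "Amazon", "Prime"]
pos = ["VERB", "NOUN", "PUNCT", "ADJ", "NOUN", "ADP", "PROPN", "PROPN"]
doc = Doc(nlp.vocab, words=words, pos=pos)

# LOWER compares against the lowercased token text, so its value must be lowercase.
pattern1 = [{"LOWER": "amazon"}, {"IS_TITLE": True, "POS": "PROPN"}]
# "ad-free" is tokenized as three tokens, so it needs three dictionaries.
pattern2 = [{"LOWER": "ad"}, {"TEXT": "-"}, {"LOWER": "free"}, {"POS": "NOUN"}]

matcher = Matcher(nlp.vocab)
matcher.add("PATTERN1", [pattern1])  # spaCy v3 signature: list of patterns
matcher.add("PATTERN2", [pattern2])

# Collect the pattern name and matched span text for each match.
results = [
    (doc.vocab.strings[match_id], doc[start:end].text)
    for match_id, start, end in matcher(doc)
]
print(results)
```

The two changes are the lowercase value for LOWER in pattern1 and the split of "ad-free" into separate token dictionaries in pattern2.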