Extracting countries and relationships
In the previous exercise, you wrote a script using spaCy's PhraseMatcher to find country names in text. Let's use that country matcher on a longer text, analyze the syntax, and update the document's entities with the matched countries. The nlp object has already been created.
The text is available as the variable text, and the PhraseMatcher with the country patterns is available as the variable matcher. The Span class has already been imported.
This exercise is part of the course Advanced NLP with spaCy.
Interactive exercise
Try this exercise by completing the sample code.
# Create a doc and find matches in it
doc = ____
# Iterate over the matches
for match_id, start, end in matcher(doc):
    # Create a Span with the label for "GPE"
    span = ____(____, ____, ____, label=____)
    # Overwrite the doc.ents and add the span
    doc.ents = list(doc.ents) + [____]
# Print the entities in the document
print([(ent.text, ent.label_) for ent in doc.ents if ent.label_ == 'GPE'])
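
For reference, here is a minimal, self-contained sketch of how the completed template might look. It is not the official solution. It assumes spaCy v3 (where PhraseMatcher.add takes a list of Doc patterns), uses a blank English pipeline in place of the preloaded nlp object, and substitutes a short hypothetical country list and sample sentence for the exercise's matcher and text variables.

```python
import spacy
from spacy.matcher import PhraseMatcher
from spacy.tokens import Span

# Assumption: a blank English pipeline stands in for the exercise's nlp object
nlp = spacy.blank("en")

# Assumption: hypothetical stand-ins for the exercise's country patterns and text
COUNTRIES = ["Namibia", "South Africa", "Czech Republic"]
text = "After the war, Namibia was administered by South Africa until independence."

# Build the PhraseMatcher from Doc patterns (spaCy v3 signature)
matcher = PhraseMatcher(nlp.vocab)
matcher.add("COUNTRY", list(nlp.pipe(COUNTRIES)))

# Create a doc and find matches in it
doc = nlp(text)

# Iterate over the matches
for match_id, start, end in matcher(doc):
    # Create a Span with the label for "GPE"
    span = Span(doc, start, end, label="GPE")
    # Overwrite the doc.ents and add the span
    doc.ents = list(doc.ents) + [span]

# Collect and print the GPE entities in the document
gpe_entities = [(ent.text, ent.label_) for ent in doc.ents if ent.label_ == "GPE"]
print(gpe_entities)
```

With a trained pipeline, doc.ents may already contain entities that overlap a matched country, and assigning overlapping spans to doc.ents raises an error. In that case the combined spans would need to be deduplicated first, for example with spacy.util.filter_spans.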