
Extracting countries and relationships

In the previous exercise, you wrote a script using spaCy's PhraseMatcher to find country names in text. Let's use that country matcher on a longer text, analyze the syntax and update the document's entities with the matched countries. The nlp object has already been created.

The text is available as the variable text, and the PhraseMatcher with the country patterns is available as the variable matcher. The Span class has already been imported.

This exercise is part of the course

Advanced NLP with spaCy


Hands-on interactive exercise

Try this exercise by completing the sample code.

# Create a doc and find matches in it
doc = ____

# Iterate over the matches
for match_id, start, end in matcher(doc):
    # Create a Span with the label for "GPE"
    span = ____(____, ____, ____, label=____)

    # Overwrite the doc.ents and add the span
    doc.ents = list(doc.ents) + [____]

# Print the entities in the document
print([(ent.text, ent.label_) for ent in doc.ents if ent.label_ == 'GPE'])
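
For reference, here is a minimal sketch of how the completed loop might look when run outside the course environment. It is not the official solution. The blank English pipeline, the three-item country list, and the sample sentence are all assumptions standing in for the preloaded nlp, matcher, and text variables. The matcher.add call uses the list-based form from recent spaCy versions.

```python
import spacy
from spacy.matcher import PhraseMatcher
from spacy.tokens import Span

# Assumption: a blank English pipeline stands in for the preloaded nlp object.
# It has no entity recognizer, so doc.ents starts out empty.
nlp = spacy.blank("en")

# Assumption: a tiny hypothetical country list replaces the course's patterns.
countries = ["Czech Republic", "Slovakia", "Namibia"]
matcher = PhraseMatcher(nlp.vocab)
matcher.add("COUNTRY", [nlp.make_doc(name) for name in countries])

# Assumption: a short sample sentence replaces the course's longer text.
text = (
    "Namibia gained independence in 1990, and Slovakia split "
    "from the Czech Republic in 1993."
)

# Create a doc and find matches in it
doc = nlp(text)

# Iterate over the matches
for match_id, start, end in matcher(doc):
    # Create a Span over the matched tokens with the label "GPE"
    span = Span(doc, start, end, label="GPE")

    # Overwrite doc.ents with the existing entities plus the new span
    doc.ents = list(doc.ents) + [span]

# Collect and print the GPE entities in the document
gpe_entities = [(ent.text, ent.label_) for ent in doc.ents if ent.label_ == "GPE"]
print(gpe_entities)
```

With a trained pipeline, doc.ents may already contain predicted entities. Assigning a span that overlaps one of them raises a ValueError, so in that setting it may be necessary to reset or filter the existing entities first.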