EntityRuler with multi-patterns in spaCy

EntityRuler lets you add entities to doc.ents and can improve a pipeline's named entity recognition performance. In this exercise, you will practice adding an EntityRuler component to an existing nlp pipeline so that multiple entities are classified correctly.

The en_core_web_sm model is already loaded and available as nlp. The example text is available as example_text; use nlp and doc to access the spaCy model and the Doc container of example_text, respectively.

This exercise is part of the course

Natural Language Processing with spaCy

Exercise instructions

  • Print a list of (text, label) tuples for the entities that the nlp model finds in example_text.
  • Define multiple patterns that match the lowercase tokens brother and sisters and assign them the PERSON label.
  • Add an EntityRuler component to the nlp pipeline and add the patterns to the EntityRuler.
  • Print the list of (text, label) tuples for the entities in example_text again, using the updated nlp model.
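
The steps above can be sketched on a small, self-contained case. This is only an illustration under stated assumptions: it uses a blank English pipeline (`spacy.blank("en")`) instead of the pretrained model so that it runs without a model download, and the sentence in `text` is a made-up stand-in for example_text. Because a blank pipeline has no statistical NER component, every entity in the second printout comes from the EntityRuler.

```python
import spacy

# Assumption: a blank English pipeline (tokenizer only, no statistical NER),
# used here so the sketch runs without downloading a pretrained model.
nlp = spacy.blank("en")

# Hypothetical sentence, not the exercise's example_text.
text = "My Brother and my sisters live in Paris."

# Step 1: entities before the ruler is added (empty for a blank pipeline).
before = [(ent.text, ent.label_) for ent in nlp(text).ents]
print("Before EntityRuler:", before)

# Step 2: one pattern dict per entity; "LOWER" compares against the
# lowercased token text, so "Brother" and "brother" both match.
patterns = [
    {"label": "PERSON", "pattern": [{"LOWER": "brother"}]},
    {"label": "PERSON", "pattern": [{"LOWER": "sisters"}]},
]

# Step 3: add the EntityRuler component and register the patterns.
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns(patterns)

# Step 4: entities after the ruler is added.
after = [(ent.text, ent.label_) for ent in nlp(text).ents]
print("After EntityRuler:", after)
```

With a pretrained pipeline that already contains an ner component, the results will also include the model's own predictions, so the output will differ from this sketch.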

Interactive exercise

Try this exercise by completing the sample code below.

import spacy

nlp = spacy.load("en_core_web_md")

# Print a list of (text, label) tuples for the entities in example_text
print("Before EntityRuler: ", [____ for ____ in nlp(____).____], "\n")

# Define patterns that assign the PERSON label to the lowercase tokens "brother" and "sisters"
patterns = [{"label": ____, "pattern": [{"lower": ____}]},
            {"label": ____, "pattern": [{"lower": ____}]}]

# Add an EntityRuler component and add the patterns to the ruler
ruler = nlp.____("entity_ruler")
ruler.____(____)

# Print a list of (text, label) tuples for the entities after adding the ruler
print("After EntityRuler: ", [(ent.____, ent.____) for ent in nlp(example_text).____])