EntityRuler with multi-patterns in spaCy
EntityRuler lets you add entities to doc.ents and can boost a pipeline's named entity recognition performance. In this exercise, you will practice adding an EntityRuler component to an existing nlp pipeline to ensure that multiple entities are classified correctly.
The en_core_web_sm model is already loaded and available as nlp. An example text is available in example_text, and you can use nlp and doc to access the spaCy model and the Doc container of example_text, respectively.
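As a rough illustration of how an EntityRuler fills doc.ents, the sketch below uses a blank English pipeline rather than the preloaded model, so it runs without downloading anything. The sentence and the two phrase patterns are made up for illustration and are not part of the exercise.

```python
import spacy

# Blank English pipeline: tokenizer only, so no statistical NER is involved.
nlp = spacy.blank("en")
text = "Acme Corp opened an office in Springfield."  # hypothetical sentence

# Without any entity component, doc.ents is empty.
before = [(ent.text, ent.label_) for ent in nlp(text).ents]

# Add the rule-based component and register two exact-match phrase patterns.
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "ORG", "pattern": "Acme Corp"},
    {"label": "GPE", "pattern": "Springfield"},
])

# The same text now yields the entities defined by the patterns.
after = [(ent.text, ent.label_) for ent in nlp(text).ents]
print("Before:", before)
print("After: ", after)
```

With a pretrained pipeline the "before" list would typically already contain entities from the statistical NER component; the blank pipeline is used here only to isolate what the ruler contributes.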
This exercise is part of the course Natural Language Processing with spaCy.
Exercise instructions
- Print a list of tuples of entity text and type for example_text using the nlp model.
- Define multiple patterns that match the lowercased tokens brother and sisters and assign them the PERSON label.
- Add an EntityRuler component to the nlp pipeline and add the patterns to the EntityRuler.
- Print a list of tuples of entity text and type for example_text using the nlp model.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
import spacy
nlp = spacy.load("en_core_web_md")
# Print a list of tuples of entities text and types in the example_text
print("Before EntityRuler: ", [____ for ____ in nlp(____).____], "\n")
# Define patterns that assign the PERSON label to lowercased brother and sisters tokens
patterns = [{"label": ____, "pattern": [{"lower": ____}]},
            {"label": ____, "pattern": [{"lower": ____}]}]
# Add an EntityRuler component and add the patterns to the ruler
ruler = nlp.____("entity_ruler")
ruler.____(____)
# Print a list of tuples of entities text and types
print("After EntityRuler: ", [(ent.____, ent.____) for ent in nlp(example_text).____])
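The same steps can be sketched end to end on a blank pipeline. This is only one possible way to fill in the blanks, not the course's reference solution: example_text is not shown on this page, so the sentence below is a hypothetical stand-in, and spacy.blank("en") replaces the pretrained model so the snippet runs without a model download.

```python
import spacy

# Stand-in pipeline and text (assumptions; the exercise uses a pretrained
# model and its own example_text, neither of which is reproduced here).
nlp = spacy.blank("en")
sample_text = "My Brother and my SISTERS visited."

# Token patterns: "lower" compares against the lowercased token text,
# so "Brother" and "SISTERS" both match regardless of capitalization.
patterns = [
    {"label": "PERSON", "pattern": [{"lower": "brother"}]},
    {"label": "PERSON", "pattern": [{"lower": "sisters"}]},
]

# Add the EntityRuler component and hand it the patterns.
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns(patterns)

# Collect (text, label) tuples; ent.text keeps the original casing.
entities = [(ent.text, ent.label_) for ent in nlp(sample_text).ents]
print("After EntityRuler:", entities)
```

Each pattern is a list of per-token dictionaries, which is why a single-word match is still wrapped in a list. A multi-word entity would add one dictionary per token.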