EntityRuler for NER
EntityRuler can be combined with the EntityRecognizer of an existing model to boost its accuracy. In this exercise, you will practice combining an EntityRuler component with the existing NER component of the en_core_web_sm model. The model is already loaded as nlp.
When an EntityRuler is added before the NER component, the entity recognizer respects the entity spans that the ruler has already set and adjusts its own predictions around them. Patterns added to the EntityRuler can therefore improve the accuracy of the named entity recognition task.
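The component order described above can be inspected through nlp.pipe_names. The following is a minimal sketch, assuming spaCy v3; a blank English pipeline with an untrained ner component stands in for en_core_web_sm so that no model download is needed, and the pipeline is only inspected, not run on text.

```python
import spacy

# Assumption: a blank English pipeline with an untrained "ner" component
# stands in for en_core_web_sm, so no model download is required.
nlp = spacy.blank("en")
nlp.add_pipe("ner")

# Insert the rule-based component ahead of the statistical one.
ruler = nlp.add_pipe("entity_ruler", before="ner")

# Components are applied to a Doc in the order listed here.
print(nlp.pipe_names)
```

Because the ruler comes first in this list, its spans would already be set on the Doc by the time the ner component sees it.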
This exercise is part of the course Natural Language Processing with spaCy
Exercise instructions
- Add an EntityRuler to the nlp pipeline before the ner component.
- Define a token entity pattern to classify lower cased new york group as ORG.
- Add the patterns to the EntityRuler component.
- Run the model and print the tuple of entities text and type for the Doc container.
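In spaCy token patterns, each dictionary describes a single token, so a multi-word name is usually written as one dictionary per word. The sketch below illustrates this under stated assumptions: spaCy v3, and a blank English pipeline with no statistical NER (chosen so the snippet runs without a downloaded model, which means only the ruler's matches appear). It is an illustration of the pattern format, not the exercise's reference solution.

```python
import spacy

# Assumption: a blank pipeline is used instead of en_core_web_sm,
# so the only entities produced come from the EntityRuler.
nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")

# One dictionary per token; LOWER compares against the lowercased token text.
patterns = [
    {
        "label": "ORG",
        "pattern": [{"LOWER": "new"}, {"LOWER": "york"}, {"LOWER": "group"}],
    }
]
ruler.add_patterns(patterns)

doc = nlp("New York Group was built in 1987.")
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)  # [('New York Group', 'ORG')]
```

Matching on LOWER makes the rule case-insensitive, so "new york group" and "New York Group" would both be labeled ORG.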
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
import spacy

nlp = spacy.load("en_core_web_sm")
text = "New York Group was built in 1987."
# Add an EntityRuler to the nlp before NER component
ruler = nlp.____("entity_ruler", ____="ner")
# Define a pattern to classify lower cased new york group as ORG
patterns = [{"label": "ORG", "pattern": [{"lower": ____}]}]
# Add the patterns to the EntityRuler component
ruler.____(____)
# Run the model and print entities text and type for all the entities
doc = ____
print([(ent.____, ent.____) for ent in doc.____])