EntityRuler with blank spaCy model
EntityRuler lets you add rule-based entities to doc.ents. It can be combined with EntityRecognizer, a spaCy pipeline component for named-entity recognition, to boost accuracy, or used on its own to implement a purely rule-based entity recognition system. In this exercise, you will practice adding an EntityRuler component to a blank spaCy English model and classifying the named entities in the given text using purely rule-based named-entity recognition.
The spaCy package is already imported, and a blank spaCy English model is available as nlp. A list named patterns, which matches OpenAI and Microsoft case-insensitively (through the LOWER token attribute) and labels them as ORG, has already been created for you.
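To see why a LOWER pattern matches regardless of capitalization, here is a minimal sketch, assuming spaCy v3 is installed. The sample string is made up for illustration and is not part of the exercise. A pattern such as {"LOWER": "openai"} is compared against each token's lowercased text, which spaCy exposes as token.lower_.

```python
import spacy

# A blank English pipeline: tokenizer only, no trained components.
nlp = spacy.blank("en")

# Hypothetical sample text with three capitalizations of the same name.
doc = nlp("OpenAI OPENAI openai")

# token.lower_ is the lowercased form that a LOWER pattern is matched against.
lowered = [token.lower_ for token in doc]
print(lowered)  # ['openai', 'openai', 'openai']
```

Because all three tokens share the same lowercased form, a single LOWER pattern would cover every variant.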
This exercise is part of the course Natural Language Processing with spaCy.
Exercise instructions
- Create and add an EntityRuler component to the pipeline.
- Add the given patterns to the EntityRuler component.
- Run the model on the given text and create its corresponding Doc container.
- Print a list of (entity text, entity type) tuples for all entities in the Doc container.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
nlp = spacy.blank("en")
patterns = [{"label": "ORG", "pattern": [{"LOWER": "openai"}]},
            {"label": "ORG", "pattern": [{"LOWER": "microsoft"}]}]
text = "OpenAI has joined forces with Microsoft."
# Add EntityRuler component to the model
entity_ruler = nlp.____("entity_ruler")
# Add given patterns to the EntityRuler component
entity_ruler.____(____)
# Run the model on a given text
doc = nlp(____)
# Print entities text and type for all entities in the Doc container
print([(ent.____, ent.____) for ent in doc.____])
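For reference, one possible way to fill in the blanks is sketched below. It assumes spaCy v3, where nlp.add_pipe("entity_ruler") returns the new component, and it includes the import that the exercise environment provides for you. Treat it as an illustration, not necessarily the course's official solution.

```python
import spacy

nlp = spacy.blank("en")
patterns = [{"label": "ORG", "pattern": [{"LOWER": "openai"}]},
            {"label": "ORG", "pattern": [{"LOWER": "microsoft"}]}]
text = "OpenAI has joined forces with Microsoft."

# Add an EntityRuler component to the blank pipeline (spaCy v3 string-name API).
entity_ruler = nlp.add_pipe("entity_ruler")

# Register the token patterns with the ruler.
entity_ruler.add_patterns(patterns)

# Run the pipeline to create the Doc container.
doc = nlp(text)

# Collect (text, label) pairs for every entity the ruler found.
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)  # [('OpenAI', 'ORG'), ('Microsoft', 'ORG')]
```

The blank model has no trained EntityRecognizer, so every entity in doc.ents here comes from the rules alone.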