EntityRuler with blank spaCy model
EntityRuler lets you add entities to doc.ents. It can be combined with EntityRecognizer, spaCy's pipeline component for named-entity recognition, to boost accuracy, or used on its own to implement a purely rule-based entity recognition system. In this exercise, you will practice adding an EntityRuler component to a blank spaCy English model and classifying the named entities in the given text using purely rule-based named-entity recognition.
The spaCy package is already imported, and a blank spaCy English model is available as nlp. A list of patterns that label the lowercased forms of OpenAI and Microsoft as ORG has already been created for you.
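To illustrate why the patterns are written against the lowercased token form, here is a minimal sketch, assuming spaCy v3 where `nlp.add_pipe("entity_ruler")` returns the ruler component. The sample sentence and the names `demo_nlp`, `demo_text`, and `entities` are made up for illustration and are not part of the exercise.

```python
import spacy

# A blank English pipeline has a tokenizer but no trained components.
demo_nlp = spacy.blank("en")

# The LOWER attribute compares against the lowercased token text,
# so one pattern covers every casing variant of the same word.
patterns = [
    {"label": "ORG", "pattern": [{"LOWER": "openai"}]},
    {"label": "ORG", "pattern": [{"LOWER": "microsoft"}]},
]

ruler = demo_nlp.add_pipe("entity_ruler")
ruler.add_patterns(patterns)

# Hypothetical sentence with mixed casing, used only for this demo.
demo_text = "OPENAI, openai and MicroSoft are all matched."
doc = demo_nlp(demo_text)

# The entity text keeps its original casing; only the matching is case-insensitive.
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)
```

Matching on `LOWER` rather than on the exact text is a design choice: it trades precision for recall, which is usually acceptable for distinctive names like these.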
This exercise is part of the course
Natural Language Processing with spaCy
Exercise instructions
- Create and add an EntityRuler component to the pipeline.
- Add the given patterns to the EntityRuler component.
- Run the model on the given text and create its corresponding Doc container.
- Print a list of (entity text, entity type) tuples for all entities in the Doc container.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
nlp = spacy.blank("en")
patterns = [{"label": "ORG", "pattern": [{"LOWER": "openai"}]},
            {"label": "ORG", "pattern": [{"LOWER": "microsoft"}]}]
text = "OpenAI has joined forces with Microsoft."
# Add EntityRuler component to the model
entity_ruler = nlp.____("entity_ruler")
# Add given patterns to the EntityRuler component
entity_ruler.____(____)
# Run the model on a given text
doc = nlp(____)
# Print entities text and type for all entities in the Doc container
print([(ent.____, ent.____) for ent in doc.____])
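For reference, one way the blanks in the sample code might be filled in is sketched below. It assumes spaCy v3, where `add_pipe` accepts the string name `"entity_ruler"` and returns the component, and it adds `import spacy` so that it runs on its own. The variable `result` is introduced here only to hold the printed value. Treat this as one possible completion rather than the official solution.

```python
import spacy

nlp = spacy.blank("en")
patterns = [{"label": "ORG", "pattern": [{"LOWER": "openai"}]},
            {"label": "ORG", "pattern": [{"LOWER": "microsoft"}]}]
text = "OpenAI has joined forces with Microsoft."

# Add EntityRuler component to the model
entity_ruler = nlp.add_pipe("entity_ruler")

# Add given patterns to the EntityRuler component
entity_ruler.add_patterns(patterns)

# Run the model on the given text
doc = nlp(text)

# Collect and print (entity text, entity type) for all entities in the Doc container
result = [(ent.text, ent.label_) for ent in doc.ents]
print(result)  # [('OpenAI', 'ORG'), ('Microsoft', 'ORG')]
```

Because the pipeline is blank, the ruler is the only source of entities here, so `doc.ents` contains exactly the spans that the two patterns match.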