EntityRuler with blank spaCy model

EntityRuler lets you add entities to doc.ents. It can be combined with EntityRecognizer, a spaCy pipeline component for named-entity recognition, to boost accuracy, or used on its own to implement a purely rule-based entity recognition system. In this exercise, you will practice adding an EntityRuler component to a blank spaCy English model and classifying named entities in the given text using purely rule-based named-entity recognition.

The spaCy package is already imported and a blank spaCy English model is available as nlp. A list of patterns that classifies OpenAI and Microsoft as ORG, matching on their lowercased forms, has already been created for you.
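To see why the patterns match on the lowercased form, the sketch below contrasts a token pattern using the LOWER attribute with a plain string pattern. It assumes spaCy v3 and the ruler's default settings, under which string patterns are matched on the exact token text. The sample sentence and the variable names are made up for illustration and are not part of the exercise.

```python
import spacy

# Hypothetical sample text containing two casings of the same name.
sample = "openai and OpenAI"

# Token pattern: compares each token's lowercased form, so casing is ignored.
nlp_lower = spacy.blank("en")
ruler_lower = nlp_lower.add_pipe("entity_ruler")
ruler_lower.add_patterns([{"label": "ORG", "pattern": [{"LOWER": "openai"}]}])

# String (phrase) pattern: with default settings, matches the exact text only.
nlp_exact = spacy.blank("en")
ruler_exact = nlp_exact.add_pipe("entity_ruler")
ruler_exact.add_patterns([{"label": "ORG", "pattern": "OpenAI"}])

lower_matches = [(ent.text, ent.label_) for ent in nlp_lower(sample).ents]
exact_matches = [(ent.text, ent.label_) for ent in nlp_exact(sample).ents]

print(lower_matches)  # [('openai', 'ORG'), ('OpenAI', 'ORG')]
print(exact_matches)  # [('OpenAI', 'ORG')]
```

The LOWER-based pattern is the more forgiving choice when the text may spell a name with inconsistent capitalization.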

This exercise is part of the course

Natural Language Processing with spaCy

Exercise instructions

  • Create and add an EntityRuler component to the pipeline.
  • Add given patterns to the EntityRuler component.
  • Run the model on the given text and create its corresponding Doc container.
  • Print a list of (entity text, entity type) tuples for all entities in the Doc container.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

nlp = spacy.blank("en")
patterns = [{"label": "ORG", "pattern": [{"LOWER": "openai"}]},
            {"label": "ORG", "pattern": [{"LOWER": "microsoft"}]}]
text = "OpenAI has joined forces with Microsoft."

# Add EntityRuler component to the model
entity_ruler = nlp.____("entity_ruler")

# Add given patterns to the EntityRuler component
entity_ruler.____(____)

# Run the model on a given text
doc = nlp(____)

# Print entities text and type for all entities in the Doc container
print([(ent.____, ent.____) for ent in doc.____])
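One possible way to complete the blanks is sketched below. It assumes spaCy v3, where nlp.add_pipe accepts the component's string name and returns the component; the import is included so the snippet runs on its own, and the entities variable is added only for illustration.

```python
import spacy

nlp = spacy.blank("en")
patterns = [{"label": "ORG", "pattern": [{"LOWER": "openai"}]},
            {"label": "ORG", "pattern": [{"LOWER": "microsoft"}]}]
text = "OpenAI has joined forces with Microsoft."

# Add EntityRuler component to the model (add_pipe returns the new component)
entity_ruler = nlp.add_pipe("entity_ruler")

# Add given patterns to the EntityRuler component
entity_ruler.add_patterns(patterns)

# Run the model on the given text to get a Doc container
doc = nlp(text)

# Collect and print entity text and type for all entities in the Doc container
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)  # [('OpenAI', 'ORG'), ('Microsoft', 'ORG')]
```

Because the blank model has no statistical named-entity recognizer, every entity in doc.ents here comes from the rules alone.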