Creating a custom named entity in spaCy
If spaCy's built-in named entities aren't enough, you can make your own using spaCy's EntityRuler() class.
EntityRuler() lets you define rule-based entities of your own and add them to a spaCy pipeline.
You start by creating an instance of EntityRuler() and passing it the current pipeline, nlp.
You can then call add_patterns() on the instance and pass it a list of dictionaries, each pairing a "label" with the text "pattern" you'd like to tag with that entity.
Once you've set up a pattern, you can add the ruler to the nlp pipeline using add_pipe().
Since Acme is a technology company, you decide to tag the pattern "smartphone" with the "PRODUCT" entity tag.
spaCy has been imported and a doc already exists containing the transcribed text from the call_4_channel_2.wav file.
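As a rough, self-contained sketch of the steps described above, the snippet below builds a ruler, registers the "smartphone" pattern, and prints the resulting entities. It rests on several assumptions that are not part of the exercise: it uses a blank English pipeline from spacy.blank("en") instead of the preloaded nlp, a made-up sentence instead of the call transcript, and the spaCy v3 style of add_pipe(), where the component is added by its registered name "entity_ruler" and the ruler instance is returned. In spaCy v2, which the exercise code appears to follow, you would construct EntityRuler(nlp) yourself and pass that instance to add_pipe().

```python
import spacy

# Assumption: a blank English pipeline (tokenizer only), so no model download is needed.
nlp = spacy.blank("en")

# spaCy v3 style: add the component by name; add_pipe() returns the EntityRuler instance.
ruler = nlp.add_pipe("entity_ruler")

# Each pattern is a dictionary with a "label" key and a "pattern" key.
ruler.add_patterns([{"label": "PRODUCT", "pattern": "smartphone"}])

# Hypothetical sentence standing in for the transcribed call.
doc = nlp("I bought a new smartphone from Acme last week.")

# Collect and print the text and label of every entity found in the doc.
entities = [(entity.text, entity.label_) for entity in doc.ents]
for text, label in entities:
    print(text, label)
```

Because the blank pipeline has no statistical "ner" component, the only entity found here is the one produced by the rule.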
This exercise is part of the course
Spoken Language Processing in Python
Instructions
- Import EntityRuler from spacy.pipeline.
- Add "smartphone" as the value for the "pattern" key.
- Add the EntityRuler() instance, ruler, to the nlp pipeline.
- Print the entity attributes contained in doc.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Import EntityRuler class
from spacy.pipeline import ____
# Create EntityRuler instance
ruler = EntityRuler(nlp)
# Define pattern for new entity
ruler.add_patterns([{"label": "PRODUCT", "pattern": ____}])
# Update existing pipeline
nlp.add_pipe(____, before="ner")
# Test new entity
for entity in doc.____:
    print(entity.text, entity.label_)
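The before="ner" argument in the code above controls where the ruler sits in the pipeline. When the ruler runs ahead of the statistical entity recognizer, its matches are set first and the recognizer adjusts its predictions around them. The sketch below shows one way to check the placement. It assumes spaCy v3 and a blank pipeline, and because a blank pipeline has no trained "ner" component, a "sentencizer" stands in for it here purely for illustration.

```python
import spacy

# Assumption: blank English pipeline; "sentencizer" is a stand-in for "ner".
nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")

# Insert the ruler ahead of the existing component, mirroring before="ner".
ruler = nlp.add_pipe("entity_ruler", before="sentencizer")

# pipe_names lists the components in the order they run.
pipeline_order = nlp.pipe_names
print(pipeline_order)
```

With a full pretrained pipeline, the same check on nlp.pipe_names should show the ruler listed ahead of "ner".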