Components with extensions
Extension attributes are especially powerful if they're combined with custom pipeline components. In this exercise, you'll write a pipeline component that finds country names and a custom extension attribute that returns a country's capital, if available.
The nlp object has already been created and the Span class is already imported. A phrase matcher with all countries is available as the variable matcher. A dictionary of countries mapped to their capital cities is available as the variable capitals.
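To make the "returns a country's capital, if available" part concrete, here is a minimal sketch of a getter-based extension attribute on Span. It is not the exercise's solution: the two-entry capitals dictionary, the get_capital helper, and the example sentence are hypothetical stand-ins, and a blank English pipeline is assumed so that no trained model is needed.

```python
import spacy
from spacy.tokens import Span

# Hypothetical stand-in for the exercise's `capitals` dictionary
capitals = {"Czech Republic": "Prague", "Slovakia": "Bratislava"}


def get_capital(span):
    """Return the capital for the span's text, or None if it is not in the dictionary."""
    return capitals.get(span.text)


# Register the getter as a Span extension; force=True overwrites an
# existing registration so the snippet can be re-run safely
Span.set_extension("capital", getter=get_capital, force=True)

nlp = spacy.blank("en")  # tokenizer-only pipeline (assumption: no model required)
doc = nlp("Czech Republic may help Slovakia")

country = doc[0:2]  # the span "Czech Republic"
other = doc[2:4]    # the span "may help", which is not a country

print(country.text, "->", country._.capital)
print(other.text, "->", other._.capital)
```

Because the attribute is backed by a getter, the lookup runs each time the attribute is read through the ._ namespace, and dict.get makes it fall back to None for any span whose text is not a key in the dictionary.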
This exercise is part of the course Advanced NLP with spaCy.
Have a go at this exercise by completing this sample code.
def countries_component(doc):
    # Create an entity Span with the label 'GPE' for all matches
    doc.ents = [____(____, ____, ____, label=____)
                for match_id, start, end in matcher(doc)]
    return doc

# Add the component to the pipeline
____.____(____)
print(nlp.pipe_names)