Lexical attributes
In this example, you'll use spaCy's Doc and Token objects and lexical attributes to find percentages in a text. You'll be looking for two consecutive tokens: a number and a percent sign. The English nlp object has already been created.
This exercise is part of the course Advanced NLP with spaCy.
Exercise instructions
- Use the like_num token attribute to check whether a token in the doc resembles a number.
- Get the token following the current token in the document. The index of the next token in the doc is token.i + 1.
- Check whether the next token's text attribute is a percent sign "%".
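To see what the attributes named in the instructions return, here is a minimal sketch on a short made-up sentence. It assumes a blank English pipeline from spacy.blank("en") in place of the pre-created nlp object; the sentence and the variable names rows, number and following are illustrative only.

```python
import spacy

# Assumption: a blank English pipeline (tokenizer only) stands in for
# the nlp object that the exercise environment provides.
nlp = spacy.blank("en")
doc = nlp("About 25% agreed.")

# Collect each token's index, text, and like_num flag.
rows = [(token.i, token.text, token.like_num) for token in doc]
for row in rows:
    print(row)

# Take the first number-like token and fetch its right-hand
# neighbour by indexing the doc at token.i + 1.
number = next(token for token in doc if token.like_num)
following = doc[number.i + 1]
print(number.text, following.text)
```

Indexing a Doc with an integer returns the Token at that position, which is why token.i + 1 reaches the neighbouring token.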
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Process the text
doc = nlp("In 1990, more than 60% of people in East Asia were in extreme poverty. Now less than 4% are.")

# Iterate over the tokens in the doc
for token in doc:
    # Check if the token resembles a number
    if ____.____:
        # Get the next token in the document
        next_token = ____[____]
        # Check if the next token's text equals '%'
        if next_token.____ == '%':
            print('Percentage found:', token.text)
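One possible way to fill in the blanks is sketched below. It is not necessarily the course's reference solution. It assumes a blank English pipeline created with spacy.blank("en") instead of the pre-created nlp object, and it adds two things the exercise does not ask for: a bounds check so a number at the very end of the text does not raise an IndexError, and a percentages list (a hypothetical name) that collects the matches.

```python
import spacy

# Assumption: a blank English pipeline (tokenizer only) replaces the
# nlp object that the exercise environment creates for you.
nlp = spacy.blank("en")

# Process the text
doc = nlp("In 1990, more than 60% of people in East Asia were in extreme poverty. Now less than 4% are.")

percentages = []  # illustrative addition: keep the matches for later use

# Iterate over the tokens in the doc
for token in doc:
    # Check if the token resembles a number and is not the last token
    if token.like_num and token.i + 1 < len(doc):
        # Get the next token in the document
        next_token = doc[token.i + 1]
        # Check if the next token's text equals '%'
        if next_token.text == "%":
            percentages.append(token.text)
            print("Percentage found:", token.text)
```

The number 1990 also passes the like_num check, but it is followed by a comma rather than a percent sign, so it is not reported.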