Lemmatization
While continuing your analysis of user reviews, you noticed that stemming sometimes produces non-standard words like "fli" from "flying", which can reduce interpretability. To address this, you'll now use lemmatization, which returns actual words and helps improve the clarity and accuracy of your analysis.
WordNetLemmatizer
has been imported, stop_words
has been defined, and the necessary NLTK resources have been downloaded.
This exercise is part of the course
Natural Language Processing (NLP) in Python
Exercise instructions
- Create an instance
lemmatizer
of theWordNetLemmatizer()
class. - Use the
lemmatizer
to lemmatize thelower_tokens
.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
clean_tokens = ['flying', 'lot', 'lately', 'flights', 'keep', 'getting', 'delayed', 'honestly', 'traveling', 'work', 'gets', 'exhausting', 'endless', 'delays', 'every', 'travel', 'teaches', 'something', 'new']
# Create lemmatizer
lemmatizer = ____()
# Lemmatize each token
lemmatized_tokens = [____.____(____) for ____ in clean_tokens]
print(lemmatized_tokens)