Stemming
Now that you've cleaned the review text and removed stop words and punctuation, you're ready to normalize the remaining words with stemming, which reduces each word to its root form. This groups related word forms together, making your analysis more consistent and efficient.
The PorterStemmer class has been provided, along with a list of clean_tokens.
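To make the grouping idea concrete, here is a minimal sketch. It assumes NLTK is installed and uses a made-up word list that is not part of the exercise data. Words that share a stem are collected under one key.

```python
from collections import defaultdict

from nltk.stem import PorterStemmer

# Hypothetical sample words, chosen for illustration only
words = ['delay', 'delays', 'delayed', 'travel', 'traveling', 'flight', 'flights']

stemmer = PorterStemmer()

# Map each stem to the original words that reduce to it
groups = defaultdict(list)
for word in words:
    groups[stemmer.stem(word)].append(word)

for stem, members in groups.items():
    print(stem, members)
```

Each inflected form ends up next to its base form, which is what lets later counting or matching steps treat them as one term.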
This exercise is part of the course Natural Language Processing (NLP) in Python
Exercise instructions
- Initialize the PorterStemmer().
- Use a list comprehension to stem each token from the clean_tokens list.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
clean_tokens = ['flying', 'lot', 'lately', 'flights', 'keep', 'getting', 'delayed', 'honestly', 'traveling', 'work', 'gets', 'exhausting', 'endless', 'delays', 'every', 'travel', 'teaches', 'something', 'new']
# Create stemmer
stemmer = ____()
# Stem each token
stemmed_tokens = [____.____(____) for ____ in clean_tokens]
print(stemmed_tokens)
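One possible way to fill in the blanks is sketched below. It assumes PorterStemmer comes from NLTK's nltk.stem module (the exercise environment provides the class already, so the import is only needed when running this elsewhere), and the loop variable name token is an arbitrary choice.

```python
from nltk.stem import PorterStemmer  # assumed source of the provided class

clean_tokens = ['flying', 'lot', 'lately', 'flights', 'keep', 'getting',
                'delayed', 'honestly', 'traveling', 'work', 'gets',
                'exhausting', 'endless', 'delays', 'every', 'travel',
                'teaches', 'something', 'new']

# Create stemmer
stemmer = PorterStemmer()

# Stem each token, keeping the original order
stemmed_tokens = [stemmer.stem(token) for token in clean_tokens]
print(stemmed_tokens)
```

In the output, pairs such as 'delayed' and 'delays' or 'getting' and 'gets' collapse to a shared stem. Because the Porter algorithm strips suffixes by rule, some stems may not be dictionary words.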