
Removing stop words

You're working on a project that classifies user feedback into categories such as "product issues", "service issues", and "suggestions". Stop words (very common words such as "the", "and", or "to") rarely help distinguish one category from another. Your task is to remove them so that a model can later focus on the words that matter when assigning feedback to the correct category.

The word_tokenize function from nltk.tokenize and the stopwords corpus from nltk.corpus (which provides stopwords.words) have been imported for you. Additionally, the NLTK resources punkt_tab and stopwords have already been downloaded.

This exercise is part of the course

Natural Language Processing (NLP) in Python


Exercise instructions

  • Tokenize the provided feedback into words.
  • Get the list of English stop words.
  • Remove English stop words and save the result in filtered_tokens.
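
The three steps above can be sketched without NLTK to show the underlying idea. This is only an illustration under stated assumptions: simple_tokenize is a hypothetical regex stand-in for word_tokenize, and STOP_WORDS is a small hand-picked set, far shorter than the English list NLTK provides.

```python
import re

# Hypothetical, hand-picked stop words for illustration only;
# the English list shipped with NLTK is much longer.
STOP_WORDS = {"i", "out", "to", "and", "a", "very"}


def simple_tokenize(text):
    # Rough stand-in for a real tokenizer: runs of word characters
    # become tokens, and each punctuation mark becomes its own token.
    return re.findall(r"\w+|[^\w\s]", text)


def remove_stop_words(tokens, stop_words):
    # Lowercase each token before the lookup so "Very" matches "very",
    # but keep the original casing in the result.
    return [tok for tok in tokens if tok.lower() not in stop_words]


feedback = "I reached out to support and got a helpful response within minutes!!! Very #impressed"
filtered = remove_stop_words(simple_tokenize(feedback), STOP_WORDS)
print(filtered)
```

Note that punctuation tokens such as "!" and "#" survive this filter, because they are not stop words; removing them would be a separate cleaning step.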

Hands-on interactive exercise

Try this exercise by completing the sample code below.

feedback = "I reached out to support and got a helpful response within minutes!!! Very #impressed"

# Tokenize the text
tokens = word_tokenize(____)

# Get the list of English stop words
stop_words = stopwords.____('____')

# Remove stop words 
filtered_tokens = [____ for word in tokens if ____.lower() not in ____]

print(filtered_tokens)