
Lowercasing

You're analyzing user reviews for a travel website. These reviews often include inconsistent capitalization, such as "TRAVEL" and "travel". To prepare the text for sentiment analysis and topic extraction, you'll first convert the text to lowercase, then tokenize it into words and remove stop words and punctuation.

The word_tokenize() function and a stop_words list have been provided. NLTK resources are already downloaded.
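To see why inconsistent capitalization matters, here is a minimal sketch that counts words before and after lowercasing. The `words` list is a hypothetical example, not data from the exercise.

```python
from collections import Counter

# Hypothetical tokens with inconsistent capitalization (illustration only).
words = ["TRAVEL", "travel", "Travel", "delays"]

# Without lowercasing, each capitalization variant is counted as a separate word.
raw_counts = Counter(words)

# After lowercasing, the variants collapse into a single entry.
lower_counts = Counter(w.lower() for w in words)

print(raw_counts["travel"])    # count of the exact string "travel" only
print(lower_counts["travel"])  # count of all capitalization variants combined
```

Any later frequency-based step, such as topic extraction, then sees one word instead of several.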

This exercise is part of the course

Natural Language Processing (NLP) in Python


Exercise instructions

  • Convert the provided review into lowercase.
  • Tokenize the lower_text into words.
  • Use a list comprehension to remove stop words and punctuation, checking each token against stop_words and string.punctuation.
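The three steps above can be sketched on a different sentence as follows. This is an illustration under stated assumptions, not the exercise solution: `simple_tokenize` is a hypothetical regex-based stand-in for the provided word_tokenize(), and the small `stop_words` set stands in for the provided list, so that the snippet runs with the standard library alone.

```python
import re
import string

# Assumption: a tiny illustrative stop-word set, standing in for the provided stop_words list.
stop_words = {"i", "the", "and", "a", "was", "but"}

def simple_tokenize(text):
    # Hypothetical stand-in for word_tokenize(): runs of word characters,
    # or single punctuation characters, each become one token.
    return re.findall(r"\w+|[^\w\s]", text)

sample = "The HOTEL was great, but the Wifi was SLOW!"

# Step 1: lowercase the whole string so "HOTEL" and "hotel" match.
lower_text = sample.lower()

# Step 2: split the lowercased text into tokens.
tokens = simple_tokenize(lower_text)

# Step 3: keep only tokens that are neither stop words nor punctuation.
clean_tokens = [word for word in tokens
                if word not in stop_words and word not in string.punctuation]

print(clean_tokens)  # ['hotel', 'great', 'wifi', 'slow']
```

Lowercasing comes first because stop-word lists are typically lowercase, so "The" would otherwise slip past the filter.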

Interactive hands-on exercise

Try to solve this exercise by completing the sample code.

review = "I have been FLYING a lot lately and the Flights just keep getting DELAYED. Honestly, traveling for WORK gets exhausting with endless delays, but every trip teaches you something new!"

# Lowercase the review
lower_text = ____

# Tokenize the lower_text into words
tokens = ____

# Remove stop words and punctuation
clean_tokens = [____ if word ____ and word ____]

print(clean_tokens)