Lowercasing

You're analyzing user reviews for a travel website. These reviews often contain inconsistent capitalization, such as "TRAVEL" and "travel". To prepare the text for sentiment analysis and topic extraction, you'll first convert all words to lowercase, then tokenize the text and remove stop words and punctuation.

The word_tokenize() function and a stop_words list have been provided. NLTK resources are already downloaded.
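
To see why inconsistent capitalization matters, here is a minimal sketch that counts words before and after lowercasing. The words list is a made-up sample for illustration and is not part of the exercise data.

```python
from collections import Counter

# Hypothetical mini-sample of review words with mixed capitalization
words = ["TRAVEL", "travel", "Travel", "delays", "DELAYS"]

# Without lowercasing, each spelling variant is counted separately
raw_counts = Counter(words)

# After lowercasing, the variants collapse into one entry per word
lower_counts = Counter(word.lower() for word in words)

print(raw_counts["travel"])    # 1
print(lower_counts["travel"])  # 3
```

Without lowercasing, a frequency-based analysis would treat "TRAVEL", "Travel", and "travel" as three unrelated words.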

This exercise is part of the course

Natural Language Processing (NLP) in Python

Exercise instructions

  • Convert the provided review into lowercase.
  • Tokenize the lower_text into words.
  • Use a list comprehension to remove stop words and punctuation, keeping only words that appear in neither stop_words nor string.punctuation.

Hands-on interactive exercise

Try this exercise by completing the following sample code.

review = "I have been FLYING a lot lately and the Flights just keep getting DELAYED. Honestly, traveling for WORK gets exhausting with endless delays, but every trip teaches you something new!"

# Lowercase the review
lower_text = ____

# Tokenize the lower_text into words
tokens = ____

# Remove stop words and punctuation
clean_tokens = [____ if word ____ and word ____]

print(clean_tokens)
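
The same lowercase, tokenize, and filter pattern can be sketched without NLTK. This is an illustration under stated assumptions, not the exercise's reference solution: simple_tokenize is a hypothetical regex stand-in for word_tokenize(), the stop_words set is a small made-up sample rather than the provided list, and the review is shortened to its first sentence.

```python
import re
import string

# First sentence of the exercise review, shortened for illustration
review = "I have been FLYING a lot lately and the Flights just keep getting DELAYED."

# Assumption: a tiny illustrative stop word set, not the list provided in the exercise
stop_words = {"i", "have", "been", "a", "and", "the", "just"}

def simple_tokenize(text):
    # Hypothetical stand-in for word_tokenize(): runs of word characters
    # and single punctuation marks each become their own token
    return re.findall(r"\w+|[^\w\s]", text)

# Lowercase first so "FLYING" and "Flights" match their lowercase forms
lower_text = review.lower()

# Split the lowercased text into tokens
tokens = simple_tokenize(lower_text)

# Keep only tokens that are neither stop words nor punctuation marks
clean_tokens = [
    word for word in tokens
    if word not in stop_words and word not in string.punctuation
]

print(clean_tokens)
# ['flying', 'lot', 'lately', 'flights', 'keep', 'getting', 'delayed']
```

Lowercasing before filtering matters here: the stop word set holds lowercase entries only, so "I" would slip through the filter if the text were not lowercased first.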