Exercise

Building vocabulary from customer reviews

You're part of a product analytics team at TechZone, a consumer electronics company. You've received a small batch of customer reviews for a new gadget. To analyze the reviews, you'll first preprocess the text and then build a vocabulary: the set of unique words across all reviews, which defines the features used to represent each review as numerical data.

A preprocess() function is pre-loaded for you. It lowercases the text, tokenizes it, and removes punctuation.
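
The actual implementation of preprocess() is not shown in the exercise. As a rough illustration of the three steps it is described as performing, here is a minimal sketch using only the standard library. The name preprocess_sketch, the whitespace-based tokenization, and the choice to return a single joined string are all assumptions; the pre-loaded function may differ.

```python
import string

def preprocess_sketch(text):
    """Hypothetical stand-in for the pre-loaded preprocess() (assumed behavior)."""
    # Step 1: lowercase the text
    text = text.lower()
    # Step 2: tokenize (assumption: simple whitespace split)
    tokens = text.split()
    # Step 3: remove punctuation characters, dropping tokens that end up empty
    table = str.maketrans("", "", string.punctuation)
    cleaned = [tok.translate(table) for tok in tokens]
    cleaned = [tok for tok in cleaned if tok]
    # Assumption: return one space-joined string per review
    return " ".join(cleaned)

print(preprocess_sketch("Great battery life, but the screen scratches easily!"))
# → great battery life but the screen scratches easily
```

Returning a string rather than a token list is a design choice made here because text vectorizers typically accept raw strings as input.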

Instructions

  • Preprocess each review in the dataset using the preprocess() function.
  • Fit the vectorizer on the preprocessed reviews.
  • Print the resulting vocabulary.
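
The exercise does not say which vectorizer is used. The sketch below assumes it is scikit-learn's CountVectorizer and shows the fit-and-print steps on a few made-up reviews that stand in for the output of preprocess(). The variable names and the review texts are illustrative only and are not taken from the exercise.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Made-up reviews, assumed to be already lowercased and stripped of punctuation
cleaned_reviews = [
    "great battery life",
    "battery drains fast",
    "great screen",
]

# Assumption: the exercise's vectorizer is a CountVectorizer with default settings
vectorizer = CountVectorizer()

# Fitting learns the vocabulary: every unique word across all reviews
vectorizer.fit(cleaned_reviews)

# vocabulary_ is a dict mapping each word to its column index
vocab = vectorizer.vocabulary_
print(sorted(vocab))
# → ['battery', 'drains', 'fast', 'great', 'life', 'screen']
```

Each word in the vocabulary becomes one feature column, so these three reviews would each be represented by a vector of six counts. With default settings, CountVectorizer ignores single-character tokens.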