Exercise

TF-IDF representation of product feedback

You're working with a customer support team at a smart home company. They've collected user feedback on a range of smart devices and want to identify which words stand out in each review. You suggest using the TF-IDF technique to highlight the most relevant terms across feedback entries. Let's help them get started!

A preprocess() function, which takes a text and returns its processed version, is pre-loaded for you. It applies lowercasing, tokenization, and punctuation removal. pandas has been imported as pd, and the TfidfVectorizer class is ready to use.
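
The exercise does not show the body of the pre-loaded preprocess() function. As a rough idea of what such a function might do, here is a minimal sketch under stated assumptions: the name preprocess_sketch is hypothetical, and the whitespace-based tokenization is a simplification, not necessarily what the course uses.

```python
import string


def preprocess_sketch(text):
    """Hypothetical stand-in for the pre-loaded preprocess() function."""
    # Lowercase the text
    text = text.lower()
    # Remove every ASCII punctuation character
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Tokenize on whitespace (an assumption; the real function may use a dedicated tokenizer)
    tokens = text.split()
    # Join the tokens back into one string, which is the input TfidfVectorizer expects
    return " ".join(tokens)


print(preprocess_sketch("Great speaker, LOVE it!"))
```

Returning a single string rather than a list of tokens keeps the output directly usable by TfidfVectorizer, which accepts raw text documents.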

Instructions

  • Initialize a TF-IDF vectorizer.
  • Transform the cleaned reviews into a tfidf_matrix.
  • Create a DataFrame df for the tfidf_matrix, having the vocabulary words as columns.