Exercise

Tfidf and a BOW on the same data

In this exercise, you will transform the review column of the Amazon product reviews using both a bag-of-words (BOW) and a Tfidf transformation.

Build both vectorizers, specifying only the maximum number of features, which should equal 100. Create DataFrames from the transformed data and print the first 5 rows of each.

Be careful how you specify the maximum number of features in the vocabulary. A large vocabulary size can result in your session being disconnected.
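
One likely reason for this warning is that a dense DataFrame stores one value for every row and every vocabulary term. The sketch below is only a rough estimate under stated assumptions: the helper `dense_size_bytes` and the row count of 10,000 are hypothetical, and 8 bytes per value assumes 64-bit integer or float entries.

```python
def dense_size_bytes(n_rows, n_features, itemsize=8):
    """Approximate memory of a dense n_rows x n_features array.

    itemsize=8 assumes 64-bit values; real overhead will be somewhat higher.
    """
    return n_rows * n_features * itemsize


# Hypothetical dataset size, used only for illustration.
n_reviews = 10_000

small = dense_size_bytes(n_reviews, 100)     # vocabulary capped at 100 terms
large = dense_size_bytes(n_reviews, 50_000)  # much larger vocabulary

print(f"100 features:    {small / 1e6:.0f} MB")
print(f"50,000 features: {large / 1e9:.0f} GB")
```

Capping the vocabulary keeps the dense representation small enough to build and print comfortably.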

Instructions

  • Import the BOW and Tfidf vectorizers.
  • Build and fit a BOW and a Tfidf vectorizer on the review column, limiting the number of created features to 100.
  • Create DataFrames from the transformed vector representations.