
Exercise

Specify token sequence length with BOW

We saw in the video that specifying different token sequence lengths (what we called n-grams) lets us better capture context, which can be very important.
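To make the idea of token sequence length concrete, here is a minimal sketch in plain Python. The `ngrams` helper and the example sentence are hypothetical illustrations, not part of the exercise; they only show how a bigram such as "not good" keeps context that the separate tokens "not" and "good" lose.

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a space-joined string."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical example sentence, split on whitespace for simplicity.
tokens = "the battery is not good".split()

unigrams = ngrams(tokens, 1)  # single tokens
bigrams = ngrams(tokens, 2)   # pairs of adjacent tokens

print(unigrams)
print(bigrams)
```

With unigrams alone, "good" looks like a positive signal; the bigram "not good" preserves the negation.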

In this exercise, you will work with a sample of the Amazon product reviews. Your task is to build a BOW vocabulary from the review column, specifying the sequence length of tokens.

Instructions

  • Build the vectorizer, specifying the token sequence length to be uni- and bigrams.
  • Fit the vectorizer.
  • Transform the review column using the fitted vectorizer.
  • When creating the DataFrame, make sure to specify the column names correctly.