
BoW model for movie taglines

In this exercise, you are given a corpus of more than 7,000 movie taglines. Your task is to generate the bag-of-words representation, bow_matrix, for these taglines. We will skip the text preprocessing step and generate bow_matrix directly from the raw taglines.

We will also investigate the shape of the resultant bow_matrix. The first five taglines in corpus have been printed to the console for you to examine.
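
To make the idea of the matrix's shape concrete, here is a minimal pure-Python sketch of a bag-of-words table. The three taglines in toy_corpus are invented for illustration and are not taken from the exercise's corpus; the tokenize helper is also hypothetical and only roughly imitates common defaults (lowercasing, keeping tokens of two or more characters).

```python
import re
from collections import Counter

# Hypothetical stand-in data; the exercise's real `corpus` holds 7,000+ taglines.
toy_corpus = [
    "One night changes everything",
    "Everything has a price",
    "The night has a price",
]

def tokenize(text):
    # Lowercase the text and keep tokens of two or more word characters.
    return re.findall(r"\b\w\w+\b", text.lower())

# Vocabulary: every distinct token, sorted so the column order is deterministic.
vocabulary = sorted({token for doc in toy_corpus for token in tokenize(doc)})

# One row per document, one column per vocabulary term; each cell is a count.
bow_rows = []
for doc in toy_corpus:
    counts = Counter(tokenize(doc))
    bow_rows.append([counts[term] for term in vocabulary])

# The shape is (number of documents, size of the vocabulary).
shape = (len(bow_rows), len(vocabulary))
print(vocabulary)
print(shape)
```

The first element of the shape is the number of taglines and the second is the number of distinct words found across all of them, which is why the second dimension tends to grow quickly on a large corpus.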

This exercise is part of the course Feature Engineering for NLP in Python.

Exercise instructions

  • Import the CountVectorizer class from sklearn.feature_extraction.text.
  • Instantiate a CountVectorizer object. Name it vectorizer.
  • Using fit_transform(), generate bow_matrix for corpus.

Hands-on interactive exercise

Complete this exercise by filling in the blanks in the sample code.

# Import CountVectorizer
from sklearn.feature_extraction.text import ____

# Create CountVectorizer object
____ = ____

# Generate matrix of word vectors
bow_matrix = vectorizer.____(____)

# Print the shape of bow_matrix
print(bow_matrix.shape)
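
For reference, one way the completed template might look is sketched below. It assumes scikit-learn is installed, and it defines a small invented corpus as a stand-in because the exercise's real corpus is only available in the course environment.

```python
# Import CountVectorizer
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical stand-in for the exercise's `corpus` of movie taglines.
corpus = [
    "One night changes everything",
    "Everything has a price",
    "The night has a price",
]

# Create CountVectorizer object with its default settings
vectorizer = CountVectorizer()

# Learn the vocabulary and generate the matrix of word counts in one step
bow_matrix = vectorizer.fit_transform(corpus)

# Print the shape of bow_matrix: (number of documents, vocabulary size)
print(bow_matrix.shape)
```

fit_transform() returns a sparse matrix, so bow_matrix.shape can be read cheaply even when the vocabulary is large. With the default settings the vectorizer lowercases the text and ignores single-character tokens such as "a", so the column count reflects only the remaining distinct words.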