tf-idf vectors for TED talks
In this exercise, you have been given a corpus, ted, which contains the transcripts of 500 TED Talks. Your task is to generate the tf-idf vectors for these talks.
In a later lesson, we will use these vectors to generate recommendations of similar talks based on the transcript.
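To make concrete what a tf-idf vector holds, here is a minimal pure-Python sketch. It assumes the textbook weighting (raw term count times log(N / df)) and a made-up three-document corpus standing in for ted; neither comes from the exercise. scikit-learn's TfidfVectorizer defaults to a smoothed idf and L2-normalized rows, so its numbers will differ from these.

```python
import math
from collections import Counter

# Hypothetical stand-in for the `ted` corpus: three tiny "transcripts".
docs = [
    "ideas worth spreading",
    "ideas about technology",
    "technology and design and technology",
]

# Naive whitespace tokenization (an assumption made for brevity).
tokenized = [doc.split() for doc in docs]
n_docs = len(tokenized)

# Document frequency: the number of documents each term appears in.
doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

def tfidf(tokens):
    """Textbook weighting: raw term count times log(N / df)."""
    counts = Counter(tokens)
    return {
        term: count * math.log(n_docs / doc_freq[term])
        for term, count in counts.items()
    }

# One sparse vector (a dict of term -> weight) per document.
vectors = [tfidf(tokens) for tokens in tokenized]

for doc, vec in zip(docs, vectors):
    print(doc, "->", vec)
```

Terms that occur in fewer documents, such as "worth", receive a higher weight than terms shared across documents, such as "ideas".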
This exercise is part of the course
<Kurs>Feature Engineering for NLP in Python</Kurs>
Exercise instructions
- Import TfidfVectorizer from sklearn.
- Create a TfidfVectorizer object. Name it vectorizer.
- Generate tfidf_matrix for ted using the fit_transform() method.
Interactive hands-on exercise
Try this exercise by completing this sample code.
# Import TfidfVectorizer
from ____ import ____
# Create TfidfVectorizer object
____
# Generate matrix of word vectors
tfidf_matrix = vectorizer.____(____)
# Print the shape of tfidf_matrix
print(tfidf_matrix.shape)
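One possible way to fill in the blanks is sketched below. The ted list here is a hypothetical three-document stand-in so that the snippet runs on its own; in the exercise, ted is already provided and holds the 500 transcripts.

```python
# Import TfidfVectorizer
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical stand-in for the exercise's `ted` corpus (assumption).
ted = [
    "Good morning. How are you? It has been great.",
    "Thank you so much. I have been blown away by this conference.",
    "Imagine a world where ideas spread freely between people.",
]

# Create TfidfVectorizer object
vectorizer = TfidfVectorizer()

# Generate matrix of word vectors (learn the vocabulary, then transform)
tfidf_matrix = vectorizer.fit_transform(ted)

# Print the shape of tfidf_matrix: (number of documents, vocabulary size)
print(tfidf_matrix.shape)
```

The first dimension of the printed shape equals the number of documents, so with the real corpus it should be 500; the second is the size of the vocabulary learned from the transcripts.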