tf-idf vectors for TED talks
In this exercise, you have been given a corpus ted, which contains the transcripts of 500 TED Talks. Your task is to generate the tf-idf vectors for these talks.
In a later lesson, we will use these vectors to generate recommendations of similar talks based on the transcript.
This exercise is part of the course
Feature Engineering for NLP in Python
Exercise instructions
- Import TfidfVectorizer from sklearn.
- Create a TfidfVectorizer object. Name it vectorizer.
- Generate tfidf_matrix for ted using the fit_transform() method.
Hands-on interactive exercise
Try this exercise by completing the sample code.
# Import TfidfVectorizer
from ____ import ____
# Create TfidfVectorizer object
____
# Generate matrix of word vectors
tfidf_matrix = vectorizer.____(____)
# Print the shape of tfidf_matrix
print(tfidf_matrix.shape)
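
One way the blanks above might be filled in is sketched below. It assumes that ted is an iterable of transcript strings. The three short sentences are a hypothetical stand-in for the real corpus, which is not reproduced here, so the printed shape will differ from what the 500 transcripts produce.

```python
# Import TfidfVectorizer from scikit-learn's text feature extraction module
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical stand-in for the `ted` corpus (assumption: the real corpus
# is an iterable of 500 transcript strings)
ted = [
    "We can learn a lot from how children play.",
    "The ocean is the largest habitat on the planet.",
    "Play and learning are deeply connected in children.",
]

# Create TfidfVectorizer object with default settings
vectorizer = TfidfVectorizer()

# Learn the vocabulary and idf weights, then build the document-term matrix
tfidf_matrix = vectorizer.fit_transform(ted)

# Print the shape: (number of documents, vocabulary size)
print(tfidf_matrix.shape)
```

The result is a sparse matrix with one row per talk and one column per vocabulary term, so on the real corpus the first dimension should be 500.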