Aan de slagGa gratis aan de slag

Text vectorization

You'll now transform the desc column in the UFO dataset into tf/idf vectors, since there's likely something we can learn from this field.

Deze oefening maakt deel uit van de cursus

Preprocessing for Machine Learning in Python

Cursus bekijken

Oefeninstructies

  • Print out the .head() of the desc column.
  • Instantiate a TfidfVectorizer() object.
  • Fit and transform the desc column using vec.
  • Print out the .shape of the desc_tfidf vector, to take a look at the number of columns this created.

Praktische interactieve oefening

Probeer deze oefening eens door deze voorbeeldcode in te vullen.

# Take a look at the head of the desc field
print(____)

# Instantiate the tfidf vectorizer object
vec = ____

# Fit and transform desc using vec
desc_tfidf = vec.____

# Look at the number of columns and rows
print(____.shape)
Code bewerken en uitvoeren