Exercise

# Top terms in movie clusters

Now that you have created a sparse matrix, generate cluster centers and print the top three terms in each cluster. Use the `.todense()` method to convert the sparse matrix, `tfidf_matrix`, to a dense matrix that the `kmeans()` function can process. Then, use the `.get_feature_names()` method to get a list of terms from the `tfidf_vectorizer` object. Python's `zip()` function pairs up the elements of two lists, which lets you match each term with its weight in a cluster center.
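As a minimal sketch of how `zip()` helps here, the snippet below pairs a list of terms with the weights of a single cluster center and keeps the three heaviest terms. The terms and weights are made-up values for illustration only; they do not come from the exercise data.

```python
# Hypothetical terms and one hypothetical cluster center (illustrative values)
terms = ["police", "love", "space", "family", "war"]
center = [0.10, 0.45, 0.02, 0.30, 0.25]

# zip() pairs each weight with the term at the same position;
# sorting the (weight, term) pairs in reverse puts the heaviest terms first
ranked = sorted(zip(center, terms), reverse=True)

# Keep only the term from each of the first three pairs
top_terms = [term for weight, term in ranked[:3]]
print(top_terms)  # ['love', 'family', 'war']
```

Putting the weight first in each pair means the default tuple ordering sorts by weight, so no custom sort key is needed.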

The `tfidf_vectorizer` object and the sparse matrix, `tfidf_matrix`, from the previous exercise have been retained in this exercise. `kmeans` has been imported from SciPy.

With more data points, the clusters would be more clearly defined. However, that requires more computational power than is practical in an exercise here.

Instructions

- Generate cluster centers through the `kmeans()` function.
- Generate a list of terms from the `tfidf_vectorizer` object.
- Print the top 3 terms of each cluster.