NMF features of the Wikipedia articles
Now you will explore the NMF features you created in the previous exercise. A solution to the previous exercise has been pre-loaded, so the array nmf_features is available. Also available is a list titles giving the title of each Wikipedia article.
When investigating the features, notice that for both actors, NMF feature 3 has by far the highest value. This means that both articles are reconstructed using mainly NMF component 3. In the next video, you'll see why: NMF components represent topics (for instance, acting!).
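To make the reconstruction idea concrete, here is a minimal sketch with made-up numbers. The `components` and `features` arrays below are hypothetical stand-ins, not values from the Wikipedia dataset. It assumes the usual NMF picture in which an article is approximated by its feature values multiplied by the components.

```python
import numpy as np

# Hypothetical NMF components: 3 "topics" over a 4-word vocabulary (made-up values).
components = np.array([
    [1.0, 0.0, 0.0, 0.5],
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 1.0],
])

# Hypothetical NMF features for one article; the last feature dominates.
features = np.array([0.01, 0.0, 0.4])

# Approximate reconstruction of the article: a weighted sum of the components.
reconstruction = features @ components

# Index of the feature with the highest value.
dominant = int(np.argmax(features))

# Part of the reconstruction that comes from the dominant component alone.
dominant_part = features[dominant] * components[dominant]

print(reconstruction)
print(dominant, dominant_part)
```

When one feature is much larger than the rest, the part contributed by its component is nearly the whole reconstruction, which is what "reconstructed using mainly" one component means here.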
This exercise is part of the course
Unsupervised Learning in Python
Exercise instructions
- Import pandas as pd.
- Create a DataFrame df from nmf_features using pd.DataFrame(). Set the index to titles using index=titles.
- Use the .loc[] accessor of df to select the row with title 'Anne Hathaway', and print the result. These are the NMF features for the article about the actress Anne Hathaway.
- Repeat the last step for 'Denzel Washington' (another actor).
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import pandas
____
# Create a pandas DataFrame: df
df = ____
# Print the row for 'Anne Hathaway'
print(____)
# Print the row for 'Denzel Washington'
print(____)
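One possible way to fill in the template is sketched below. Because the pre-loaded `nmf_features` and `titles` are not available outside the exercise, this sketch builds small made-up stand-ins for them; the numbers are illustrative only and are not the real feature values. The final `idxmax` step is an optional extra, not part of the exercise instructions.

```python
import numpy as np
import pandas as pd

# Stand-ins for the pre-loaded objects (made-up values, NOT the real data).
titles = ['Anne Hathaway', 'Denzel Washington', 'Climate change']
nmf_features = np.array([
    [0.00, 0.01, 0.00, 0.58],
    [0.00, 0.00, 0.02, 0.42],
    [0.37, 0.00, 0.01, 0.00],
])

# Create a pandas DataFrame: df, with one row per article title
df = pd.DataFrame(nmf_features, index=titles)

# Print the row for 'Anne Hathaway'
print(df.loc['Anne Hathaway'])

# Print the row for 'Denzel Washington'
print(df.loc['Denzel Washington'])

# Optional: column label of the largest feature in each row
top_feature = df.idxmax(axis=1)
print(top_feature)
```

With the default column labels, pandas numbers the columns from 0, so the column labelled 3 is the fourth column of the array.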