NMF features of the Wikipedia articles

Now you will explore the NMF features you created in the previous exercise. A solution to the previous exercise has been pre-loaded, so the array nmf_features is available. Also available is a list titles giving the title of each Wikipedia article.

When investigating the features, notice that for both actors, NMF feature 3 has by far the highest value. This means that both articles are reconstructed mainly from the NMF component with index 3. In the next video, you'll see why: NMF components represent topics (for instance, acting!).
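To see why a dominant feature value means the article is rebuilt mostly from one component, here is a minimal sketch. It assumes the usual NMF picture in which an article is approximated by a weighted sum of the components, with the NMF features as the weights. The arrays components and features below are hypothetical toy values, not the course data.

```python
import numpy as np

# Hypothetical toy components: 4 topics over a 5-word vocabulary.
# These numbers are made up for illustration only.
components = np.array([
    [0.9, 0.1, 0.0, 0.0, 0.0],
    [0.0, 0.8, 0.2, 0.0, 0.0],
    [0.0, 0.0, 0.7, 0.3, 0.0],
    [0.0, 0.0, 0.0, 0.4, 0.6],
])

# Hypothetical NMF features for one article: feature 3 dominates.
features = np.array([0.01, 0.0, 0.0, 0.58])

# Approximate reconstruction: weighted sum of the components.
reconstruction = features @ components

# Index of the dominant feature, and the share of the reconstruction
# that comes from the matching component.
dominant = int(np.argmax(features))
share = (features[dominant] * components[dominant]).sum() / reconstruction.sum()

print(dominant)
print(round(share, 3))
```

With these toy numbers, almost all of the reconstructed article comes from the component at index 3, which is the sense in which a single large feature value points to a single topic.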

This exercise is part of the course

Unsupervised Learning in Python

Exercise instructions

  • Import pandas as pd.
  • Create a DataFrame df from nmf_features using pd.DataFrame(). Set the index to titles using index=titles.
  • Use the .loc[] accessor of df to select the row with title 'Anne Hathaway', and print the result. These are the NMF features for the article about the actress Anne Hathaway.
  • Repeat the last step for 'Denzel Washington' (another actor).

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Import pandas
____

# Create a pandas DataFrame: df
df = ____

# Print the row for 'Anne Hathaway'
print(____)

# Print the row for 'Denzel Washington'
print(____)
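
The steps in the instructions can be sketched as follows. In the exercise, nmf_features and titles are pre-loaded; here they are replaced by small hypothetical stand-ins so the example runs on its own, and the numbers are made up rather than taken from the real articles.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the pre-loaded objects (made-up values).
titles = ['Anne Hathaway', 'Denzel Washington', 'Climate change']
nmf_features = np.array([
    [0.0, 0.0, 0.0, 0.6, 0.0, 0.0],
    [0.0, 0.1, 0.0, 0.4, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.5],
])

# Create a DataFrame with one row per article, indexed by title.
df = pd.DataFrame(nmf_features, index=titles)

# Select rows by title with .loc[]; each result is a Series of NMF features.
print(df.loc['Anne Hathaway'])
print(df.loc['Denzel Washington'])
```

Because the index is set to titles, .loc[] can look up an article by name instead of by row position. The column labels default to the integers 0, 1, 2, ..., so "feature 3" refers to the column labelled 3.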