Inspecting Tf-idf values
After creating Tf-idf features, you will often want to know which words score highest in each document of the corpus. You can find out by isolating the row you want to examine and then sorting its scores from high to low.
The DataFrame from the last exercise (tv_df) is available in your workspace.
This exercise is part of the course
Feature Engineering for Machine Learning in Python
Exercise instructions
- Assign the first row of tv_df to sample_row.
- sample_row is now a series of weights assigned to words. Sort these values to print the top 5 highest-rated words.
Hands-on interactive exercise
Try this exercise by completing the sample code.
# Isolate the row to be examined
sample_row = tv_df.____
# Print the top 5 words of the sorted output
print(sample_row.____(ascending=____).____())
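The isolate-then-sort pattern described above can be sketched on a small stand-in table. This is a minimal illustration, not the exercise's own data: toy_df, its column names, and its weights are all made up here, since tv_df only exists in the course workspace. It assumes one row per document and one column per word.

```python
import pandas as pd

# Hypothetical stand-in for tv_df: one row per document, one column per word.
# The weights are invented for illustration; they are not real Tf-idf output.
toy_df = pd.DataFrame(
    {
        "america": [0.10, 0.00],
        "citizens": [0.05, 0.31],
        "government": [0.42, 0.12],
        "liberty": [0.00, 0.27],
        "people": [0.33, 0.08],
        "union": [0.21, 0.00],
    }
)

# Isolate the first row by position; the result is a Series indexed by word
sample_row = toy_df.iloc[0]

# Sort the weights from high to low and keep the five largest
top_words = sample_row.sort_values(ascending=False).head(5)
print(top_words)
```

Selecting a single row with iloc returns a pandas Series whose index holds the column names, which is why sorting it lists words next to their weights. If only the top few values are needed, sample_row.nlargest(5) should give the same result without sorting the whole row.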