Reviewing LDA results
You have developed a topic model, napoleon_model, with 5 topics for the sentences from the book Animal Farm that reference the main character Napoleon. You have had 5 local authors review the top words and top sentences for each topic and they have provided you with themes for each topic.
To finalize your results, prepare some summary statistics about the topics. You will present these summary values along with the themes to your boss for review.
This exercise is part of the course
Introduction to Natural Language Processing in R
Exercise instructions
- Extract the gamma matrix from the topic model, napoleon_model.
- Use dplyr functions to create a tibble of the top topic in each sentence called grouped_gammas.
- Use grouped_gammas to count the number of sentences most like each topic.
- Use grouped_gammas and calculate the average gamma value for each topic.
Interactive exercise
Try this exercise by completing this sample code.
# Extract the gamma matrix
gamma_values <- tidy(___, matrix = ___)

# Create grouped gamma tibble
grouped_gammas <- gamma_values %>%
  ___(document) %>%
  ___(desc(gamma)) %>%
  ___(1) %>%
  ___(topic)

# Count (tally) by topic
grouped_gammas %>%
  ___(topic, sort = TRUE)

# Average topic weight for top topic for each sentence
grouped_gammas %>%
  ___(avg = mean(gamma)) %>%
  ___(desc(avg))
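The steps in the template can be sketched on a small, made-up gamma table. This is one possible completion under stated assumptions, not the course's official solution: the six rows below are hypothetical values rather than output from napoleon_model, and the names topic_counts and topic_avgs are introduced only for illustration. With tidytext loaded, tidy(napoleon_model, matrix = "gamma") should return a tibble with the same three columns (document, topic, gamma).

```r
library(dplyr)
library(tibble)

# Hypothetical gamma values for 3 sentences and 2 topics (NOT real model output).
# Each sentence's gammas sum to 1 across topics.
gamma_values <- tribble(
  ~document, ~topic, ~gamma,
  "s1",      1,      0.8,
  "s1",      2,      0.2,
  "s2",      1,      0.1,
  "s2",      2,      0.9,
  "s3",      1,      0.6,
  "s3",      2,      0.4
)

# Keep only the highest-gamma topic for each sentence, then group by topic
grouped_gammas <- gamma_values %>%
  group_by(document) %>%
  arrange(desc(gamma)) %>%
  slice(1) %>%
  group_by(topic)

# Number of sentences whose top topic is each topic, largest first
topic_counts <- grouped_gammas %>%
  count(topic, sort = TRUE)

# Average gamma of the winning topic, per topic, largest first
topic_avgs <- grouped_gammas %>%
  summarize(avg = mean(gamma)) %>%
  arrange(desc(avg))

print(topic_counts)
print(topic_avgs)
```

The sketch uses count() for the counting step. In dplyr, the first argument after the data in tally() is the weight wt, so tally(topic, sort = TRUE) would sum the topic numbers within each group instead of counting rows.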