Assigning topics to documents
Creating an LDA model is useless unless you can interpret and use the results. You have been given the results of running an LDA model, sentence_lda, on a set of sentences, pig_sentences. To fully understand the results of any LDA analysis, you need to explore both the beta matrix (per-topic word probabilities, which give the top words for each topic) and the gamma matrix (per-document topic proportions, which give the top topics for each document).
Given what you know about these two matrices, extract the results for a specific topic and see if the output matches expectations.
This exercise is part of the course
Introduction to Natural Language Processing in R
Exercise instructions
- Create a tibble for both the beta and gamma matrices.
- Explore topic 5 by looking at the top words for topic 5, arranging the results by decreasing beta values.
- Explore topic 5 by seeing which sentences most align with topic 5, arranging the results by decreasing gamma values.
Interactive exercise
Try this exercise by completing the sample code below.
# Extract the beta and gamma matrices
sentence_betas <- tidy(sentence_lda, ___ = "___")
sentence_gammas <- tidy(sentence_lda, ___ = "___")

# Explore Topic 5 Betas
___ %>%
  ___(topic == ___) %>%
  arrange(-___)

# Explore Topic 5 Gammas
___ %>%
  ___(topic == ___) %>%
  arrange(-___)
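
As a hedged illustration of what a completed version of this template might look like, the sketch below fits a small LDA model on made-up sentences and then extracts both matrices with tidy(). The names toy_sentences, toy_dtm, and toy_lda are hypothetical stand-ins, since sentence_lda and pig_sentences exist only in the exercise environment. The toy model has just two topics, so the sketch explores topic 1 rather than topic 5. It assumes the dplyr, tidytext, and topicmodels packages are installed.

```r
library(dplyr)
library(tidytext)
library(topicmodels)

# Made-up sentences, used only so the sketch runs on its own
toy_sentences <- tibble(
  sentence_id = 1:6,
  text = c(
    "the pigs took over the farm house",
    "the farm animals worked in the fields",
    "the pigs slept in the farm house",
    "the windmill was built by the horses",
    "the horses carried stone to the windmill",
    "the animals rebuilt the windmill after the storm"
  )
)

# Count words per sentence and cast the counts to a document-term matrix
toy_dtm <- toy_sentences %>%
  unnest_tokens(word, text) %>%
  count(sentence_id, word) %>%
  cast_dtm(sentence_id, word, n)

# Fit a two-topic model; the seed only makes the run repeatable
toy_lda <- LDA(toy_dtm, k = 2, control = list(seed = 1234))

# Extract the beta (topic-term) and gamma (document-topic) matrices as tibbles
toy_betas <- tidy(toy_lda, matrix = "beta")
toy_gammas <- tidy(toy_lda, matrix = "gamma")

# Top words for topic 1, highest beta first
topic1_words <- toy_betas %>%
  filter(topic == 1) %>%
  arrange(-beta)

# Sentences that align most with topic 1, highest gamma first
topic1_sentences <- toy_gammas %>%
  filter(topic == 1) %>%
  arrange(-gamma)

print(topic1_words)
print(topic1_sentences)
```

The same pattern would apply to the exercise objects by swapping in sentence_lda for toy_lda and filtering on topic == 5. With so few sentences the toy topics are not meaningful; the point is only the shape of the two tibbles and the filter-then-arrange step.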