Assigning topics to documents
Creating an LDA model is useless unless you can interpret and use the results. You have been given the results of running an LDA model, sentence_lda, on a set of sentences, pig_sentences. To fully understand the results of any LDA analysis, you need to explore both the beta matrix (top words by topic) and the gamma matrix (top topics per document).
Given what you know about these two matrices, extract the results for a specific topic and see if the output matches expectations.
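To make the two matrices concrete, here is a minimal sketch on a tiny made-up corpus. The names toy_sentences, toy_dtm, toy_lda, toy_betas and toy_gammas, the six sentences, and the choice of k = 2 are assumptions for illustration only; they are not the pig_sentences data or the sentence_lda model from this exercise. It assumes the dplyr, tidytext, topicmodels and tm packages are installed.

```r
library(dplyr)
library(tidytext)
library(topicmodels)

# Toy corpus (hypothetical; NOT the pig_sentences data from the exercise)
toy_sentences <- tibble(
  sentence_id = 1:6,
  text = c(
    "the pigs took over the farm",
    "the pigs gave orders on the farm",
    "the farm animals obeyed the pigs",
    "the windmill was built on the hill",
    "storms destroyed the windmill on the hill",
    "the animals rebuilt the windmill"
  )
)

# One row per word per sentence, counted, then cast to a document-term matrix
toy_dtm <- toy_sentences %>%
  unnest_tokens(word, text) %>%
  count(sentence_id, word) %>%
  cast_dtm(document = sentence_id, term = word, value = n)

# Fit a small LDA model; k = 2 and the seed are arbitrary choices for the sketch
toy_lda <- LDA(toy_dtm, k = 2, control = list(seed = 42))

# beta: one row per topic-term pair (columns: topic, term, beta)
toy_betas <- tidy(toy_lda, matrix = "beta")

# gamma: one row per document-topic pair (columns: document, topic, gamma)
toy_gammas <- tidy(toy_lda, matrix = "gamma")

# Within a topic the betas sum to 1; within a document the gammas sum to 1
beta_totals <- toy_betas %>% group_by(topic) %>% summarise(total = sum(beta))
gamma_totals <- toy_gammas %>% group_by(document) %>% summarise(total = sum(gamma))
```

Because beta is a word distribution per topic and gamma is a topic distribution per document, sorting either one in decreasing order within a single topic surfaces the words or documents most strongly tied to that topic.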
This exercise is part of the course
Introduction to Natural Language Processing in R
Exercise instructions
- Create a tibble for both the beta and gamma matrices.
- Explore topic 5 by looking at the top words for topic 5 while arranging the results by decreasing beta values.
- Explore topic 5 by seeing which sentences most align with topic 5 while arranging the results by decreasing gamma values.
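The filter-then-sort pattern these steps describe can be sketched on a small hand-built tibble. The example_betas table and its values are made up for illustration; they only mimic the topic, term and beta columns of a tidied beta matrix, and topic 2 stands in for whichever topic you want to inspect.

```r
library(dplyr)

# Hypothetical values in the shape of a tidied beta matrix: topic, term, beta
example_betas <- tibble(
  topic = c(1, 1, 1, 2, 2, 2),
  term  = c("farm", "pigs", "hill", "farm", "pigs", "hill"),
  beta  = c(0.50, 0.30, 0.20, 0.10, 0.25, 0.65)
)

# Keep only the rows for one topic, then sort from largest to smallest beta
topic_2_words <- example_betas %>%
  filter(topic == 2) %>%
  arrange(-beta)
```

For a numeric column, arrange(-beta) and arrange(desc(beta)) give the same ordering. The same pattern applies to a gamma tibble, sorting on gamma instead of beta.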
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Extract the beta and gamma matrices
sentence_betas <- tidy(sentence_lda, ___ = "___")
sentence_gammas <- tidy(sentence_lda, ___ = "___")
# Explore Topic 5 Betas
___ %>%
  ___(topic == ___) %>%
  arrange(-___)
# Explore Topic 5 Gammas
___ %>%
  ___(topic == ___) %>%
  arrange(-___)