LDA practice
You are interested in the common themes surrounding the character Napoleon in your favorite new book, Animal Farm. Napoleon is a pig who convinces his comrades to overthrow their human leaders. He eventually becomes the new leader of Animal Farm.
You have extracted all of the sentences that mention Napoleon's name, pig_sentences, and created a tokenized version of these sentences with stop words removed and stemming completed, pig_tokens. Complete LDA on these sentences and review the top words associated with some of the topics.
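The exercise instructions name a third object, pig_matrix, which is not defined in this text. It is presumably a document-term matrix built from pig_tokens. The sketch below shows one way such a matrix could be produced, assuming the tidytext, dplyr, and SnowballC packages (with tm installed for cast_dtm). The names toy_sentences, toy_tokens, and toy_matrix and the three sentences are hypothetical stand-ins, not the course data.

```r
library(dplyr)
library(tidytext)
library(SnowballC)

# Hypothetical stand-in for pig_sentences: one row per sentence
toy_sentences <- tibble(
  sentence_id = 1:3,
  text = c("Napoleon ordered the animals to build the windmill",
           "The animals feared Napoleon and his dogs",
           "Napoleon traded with the neighbouring farms")
)

# Hypothetical stand-in for pig_tokens
toy_tokens <- toy_sentences %>%
  unnest_tokens(word, text) %>%           # one lowercased word per row
  anti_join(stop_words, by = "word") %>%  # drop common stop words
  mutate(word = wordStem(word))           # stem the remaining words

# Hypothetical stand-in for pig_matrix: sentences as rows, stems as columns
toy_matrix <- toy_tokens %>%
  count(sentence_id, word) %>%
  cast_dtm(document = sentence_id, term = word, value = n)
```

Each sentence is treated as its own document here, so topics are estimated from word co-occurrence within sentences.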
This exercise is part of the course Introduction to Natural Language Processing in R.
Exercise instructions
- Perform LDA on pig_matrix while identifying 10 topics. Set a random seed of 1111 for reproducibility.
- Extract the beta matrix from the results.
- Filter the beta matrix to topic 2 only and arrange the values by decreasing beta values.
- Filter the beta matrix to topic 3 only and arrange the values by decreasing beta values.
Hands-on interactive exercise
Try this exercise by completing this sample code.
library(topicmodels)
library(tidytext)  # tidiers for topic model objects
library(dplyr)     # provides %>% and arrange()

# Perform Topic Modeling
sentence_lda <- ___(___, k = ___, method = 'Gibbs', control = list(seed = ___))

# Extract the beta matrix
sentence_betas <- ___(sentence_lda, matrix = "___")

# Topic #2: terms sorted by decreasing beta
sentence_betas %>%
  ___(topic == ___) %>%
  arrange(-___)

# Topic #3: terms sorted by decreasing beta
sentence_betas %>%
  ___(topic == ___) %>%
  arrange(-___)
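
To make the beta matrix easier to interpret, here is a minimal sketch of the same kind of workflow on a toy document-term matrix. It is an illustration under stated assumptions rather than the exercise solution: the counts, the names toy_counts, toy_dtm, toy_lda, toy_betas, and top_terms, the choice of 2 topics, and the seed 42 are all hypothetical.

```r
library(dplyr)
library(tidytext)
library(topicmodels)

# Hypothetical stemmed word counts for three sentences
toy_counts <- tibble(
  document = c("s1", "s1", "s2", "s2", "s3", "s3"),
  term     = c("windmil", "build", "dog", "fear", "trade", "farm"),
  n        = c(2L, 1L, 1L, 2L, 1L, 1L)
)

# Cast the counts to a document-term matrix (requires the tm package)
toy_dtm <- toy_counts %>%
  cast_dtm(document, term, n)

# Fit a 2-topic model with Gibbs sampling; the seed makes the run repeatable
toy_lda <- LDA(toy_dtm, k = 2, method = "Gibbs", control = list(seed = 42))

# One row per topic-term pair; beta is the probability of the term
# within the topic, so the betas of each topic sum to 1
toy_betas <- tidy(toy_lda, matrix = "beta")

# Terms of topic 1, most probable first
top_terms <- toy_betas %>%
  filter(topic == 1) %>%
  arrange(desc(beta))
```

Because beta is a within-topic probability, sorting by decreasing beta lists the words that best characterize a topic. With a corpus this small the topics themselves are not meaningful; the sketch only shows the shape of the output.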