Exercise

Comparing LDA output

So far we have run only a single LDA with one specific number of topics. The tidied output of that model, lda_out_tidy, is loaded in your workspace along with dtm_twitter. Now run LDA with 3 topics and compare the two outputs.

> lda_out_tidy

# A tibble: 35,928 x 3
   topic term        beta
   <int> <chr>      <dbl>
 1     1 flight   0.0343 
 2     1 time     0.0102 
 3     2 service  0.00882
 4     1 plane    0.00688
 5     1 trip     0.00614
 6     2 customer 0.00604
 7     1 delayed  0.00596
 8     2 airline  0.00593
 9     1 hours    0.00532
10     1 day      0.00499
# ... with 35,918 more rows
Instructions
  • Run an LDA with 3 topics and a Gibbs sampler (this may take 10 or more seconds).
  • Tidy the matrix of word probabilities.
  • Arrange the topic-term rows by word probability (beta) in descending order.
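
The three steps above can be sketched as follows. This is one possible approach, not the exercise's official solution: it assumes the topicmodels, tidytext, and dplyr packages are installed, and the names lda_out3, lda_out3_tidy, lda_out3_sorted, and the seed value are illustrative choices. If dtm_twitter is not in the workspace, the sketch builds a small hypothetical stand-in so that it still runs on its own.

```r
library(topicmodels)  # LDA()
library(tidytext)     # tidy() for LDA objects, cast_dtm()
library(dplyr)        # %>%, arrange(), desc()

# Hypothetical stand-in, used only when the real dtm_twitter is not loaded
if (!exists("dtm_twitter")) {
  toy_counts <- data.frame(
    document = c("t1", "t1", "t1", "t2", "t2", "t2", "t3",
                 "t3", "t4", "t4", "t5", "t5", "t6", "t6"),
    term     = c("flight", "delayed", "hours", "service", "customer",
                 "airline", "flight", "time", "plane", "trip",
                 "service", "airline", "day", "flight"),
    n        = c(2, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1),
    stringsAsFactors = FALSE
  )
  dtm_twitter <- toy_counts %>% cast_dtm(document, term, n)
}

# Step 1: fit LDA with 3 topics using the Gibbs sampler
# (the seed is an arbitrary value, set only for reproducibility)
lda_out3 <- LDA(dtm_twitter, k = 3, method = "Gibbs",
                control = list(seed = 1111))

# Step 2: tidy the matrix of word probabilities (one row per topic-term pair)
lda_out3_tidy <- tidy(lda_out3, matrix = "beta")

# Step 3: sort rows so the most probable terms come first
lda_out3_sorted <- lda_out3_tidy %>% arrange(desc(beta))
print(lda_out3_sorted)
```

Printing lda_out3_sorted next to lda_out_tidy arranged the same way makes it easier to see which high-probability terms move into the additional topic.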