Exercise

Finding the best value for k

You are given the object dtm, which holds the document-term matrix you generated in the previous exercise. You also have a user-defined function p(dtm=___, k=___) that fits an LDA topic model on the matrix dtm with k topics and returns the perplexity score of the model. For example, to call the function for k=3: p(dtm=dtm, k=3).

Run the function for values of k equal to 5, 6, 7, 8, 9, and 10. Take note of the perplexity values that you receive.
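
Rather than typing six separate calls, the runs can be collected in a loop. The sketch below is only an illustration, written in Python on the assumption that the session accepts the call syntax shown above; if the session uses another language, only the idea carries over. The helper names perplexity_by_k and has_lower_perplexity are hypothetical and are not part of the exercise. The sketch relies on the fact that a lower perplexity indicates a better fit.

```python
def perplexity_by_k(score, ks):
    """Call score(k) once per candidate k and return {k: perplexity}.

    `score` is any one-argument callable. In the exercise it would wrap
    the provided function, e.g. lambda k: p(dtm=dtm, k=k) (assumed usage).
    """
    return {k: score(k) for k in ks}


def has_lower_perplexity(scores, k_a, k_b):
    """Return True if the model with k_a topics scored a strictly lower
    perplexity than the model with k_b topics (lower is better)."""
    return scores[k_a] < scores[k_b]


# Assumed usage inside the exercise session, where p and dtm already exist:
#   scores = perplexity_by_k(lambda k: p(dtm=dtm, k=k), range(5, 11))
#   for k, value in scores.items():
#       print(k, value)
#   has_lower_perplexity(scores, 9, 5)
```

Passing the scoring function in as an argument keeps the helpers independent of how p is implemented, so the same loop works for any range of k.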

Based on perplexity scores, is k=9 better than k=5?
