LDA model

Now it's time to build the LDA model. Using the dictionary and corpus, you are ready to discover which topics are present in the Enron emails. By printing the words assigned to each topic, you can get a first impression of whether any obvious topics jump out. Keep in mind that the topic model is computationally heavy, so it will take a while to run. Let's give it a try!
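
As a rough illustration of what "dictionary" and "corpus" mean here, the sketch below builds both by hand in plain Python. The documents are hypothetical stand-ins for cleaned emails, and the names `token2id`, `id2word`, and `to_bow` are chosen for this example only. The ids are assigned by first appearance, so the numbering gensim produces for the same text may differ; only the overall shape (one list of `(token_id, count)` pairs per document) is the point.

```python
from collections import Counter

# Hypothetical, already-cleaned token lists standing in for the emails.
docs = [
    ["energy", "price", "market", "energy"],
    ["meeting", "schedule", "market"],
    ["price", "schedule", "energy"],
]

# Dictionary: one integer id per unique token, in order of first appearance.
token2id = {}
for doc in docs:
    for token in doc:
        token2id.setdefault(token, len(token2id))

# Reverse mapping (id -> word), the role an id2word argument plays.
id2word = {idx: token for token, idx in token2id.items()}


def to_bow(doc):
    """Return sorted (token_id, count) pairs for one document."""
    counts = Counter(token2id[token] for token in doc)
    return sorted(counts.items())


# Corpus: one bag-of-words list per document.
corpus = [to_bow(doc) for doc in docs]

for bow in corpus:
    print(bow)
```

Each document is reduced to word counts, which is all a bag-of-words topic model sees: word order is discarded.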

This exercise is part of the course

Fraud Detection in Python

Exercise instructions

  • Build the LDA model from gensim.models by passing in the corpus and the dictionary.
  • Save the 5 topics by calling print_topics on the model, and select the top 5 words for each topic.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Define the LDA model
ldamodel = gensim.models.____.____(____, num_topics=5, id2word=____, passes=5)

# Save the topics and top 5 words
topics = ____.____(num_words=____)

# Print the results
for topic in topics:
    print(topic)
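
Each printed topic is hard to scan in its raw form, so a small helper can pull out just the words. The sketch below assumes each entry is a `(topic_id, description)` pair whose description looks like `0.040*"word" + ...`, which is the shape gensim's `print_topics` usually returns. The sample `topics` list, the `TERM` pattern, and `parse_topic` are hypothetical names for this example, and the words and weights are made up for illustration rather than taken from the Enron data.

```python
import re

# Hypothetical output in the shape print_topics returns.
# These words and weights are invented, not real model results.
topics = [
    (0, '0.040*"energy" + 0.030*"price" + 0.020*"market"'),
    (1, '0.050*"meeting" + 0.025*"schedule" + 0.010*"call"'),
]

# Matches one term of the form  weight*"word".
TERM = re.compile(r'([0-9.]+)\*"([^"]+)"')


def parse_topic(description):
    """Turn a topic description string into a list of (word, weight) pairs."""
    return [(word, float(weight)) for weight, word in TERM.findall(description)]


top_words = {topic_id: parse_topic(desc) for topic_id, desc in topics}

# Print only the words per topic for a quick visual check.
for topic_id, pairs in top_words.items():
    print(topic_id, [word for word, _ in pairs])
```

Seeing only the words side by side makes it easier to judge whether a topic has an obvious theme.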