Testing perplexity

You have been given a dataset of tweets sent by tweet bots during the 2016 US election. Your boss has identified two account types of interest, Left and Right, and has asked you to perform topic modeling on the tweets from the Right tweet bots in order to summarize their content. Fit topic models with 5, 15, and 50 topics and compare their perplexity to get a general idea of how many topics the data contains.

This exercise is part of the course Introduction to Natural Language Processing in R.
Hands-on interactive exercise

Complete the sample code below to finish this exercise.

library(topicmodels)
# Setup train and test data
sample_size <- floor(0.90 * nrow(right_matrix))
set.seed(1111)
train_ind <- sample(nrow(right_matrix), size = sample_size)
train <- right_matrix[train_ind, ]
test <- right_matrix[-train_ind, ]

# Perform topic modeling
lda_model <- LDA(___, k = ___, method = ___,
                 control = list(seed = 1111))
# Evaluate on the training data
___(lda_model, newdata = ___)
# Evaluate on the held-out test data
___(lda_model, newdata = ___)
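
The scaffold above leaves the model inputs and the evaluation function blank. As a rough illustration of the workflow the exercise describes, the sketch below fits one model per candidate number of topics and records perplexity on both splits. It rests on several assumptions that are not in the exercise: `right_matrix` is not available here, so a slice of the `AssociatedPress` document-term matrix that ships with `topicmodels` stands in for it; `method = "Gibbs"` is picked as one of the two estimation methods `LDA()` supports (the other is `"VEM"`); `iter = 200` is set only to keep the run short; and `candidate_k` and `results` are illustrative names.

```r
library(topicmodels)

# Stand-in data (assumption): the exercise's right_matrix is not available,
# so the first 100 documents of the AssociatedPress document-term matrix
# bundled with topicmodels are used in its place.
data("AssociatedPress", package = "topicmodels")
dtm <- AssociatedPress[1:100, ]

# Same 90/10 train/test split as in the exercise scaffold
sample_size <- floor(0.90 * nrow(dtm))
set.seed(1111)
train_ind <- sample(nrow(dtm), size = sample_size)
train <- dtm[train_ind, ]
test <- dtm[-train_ind, ]

# Fit one model per candidate k and record perplexity on each split.
# method = "Gibbs" and iter = 200 are assumptions made for this sketch.
candidate_k <- c(5, 15, 50)
results <- data.frame(k = candidate_k, train = NA_real_, test = NA_real_)
for (i in seq_along(candidate_k)) {
  lda_model <- LDA(train, k = candidate_k[i], method = "Gibbs",
                   control = list(seed = 1111, iter = 200))
  results$train[i] <- perplexity(lda_model, newdata = train)
  results$test[i] <- perplexity(lda_model, newdata = test)
}

print(results)
```

Lower perplexity means the model assigns higher probability to the documents. Comparing the test column across the three values of k, rather than the train column alone, guards against choosing a model that only fits the training tweets well.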