Creating a corpus

You have created a tibble called russian_tweets that contains around 20,000 tweets automatically generated by bots during the 2016 U.S. election cycle, so that you can perform text analysis on them. After reviewing the available options for the analysis you have chosen, you believe that the tm package offers the easiest path forward. To conduct the analysis, you must first create a corpus and attach potentially useful metadata.

Be aware that this is real data from Twitter and as such there is always a risk that it may contain profanity or other offensive content (in this exercise, and any following exercises that also use real Twitter data).

This exercise is part of the course

Introduction to Natural Language Processing in R

Exercise instructions

  • Create a corpus using the content column of russian_tweets.
  • Attach both the following and followers columns as metadata to tweet_corpus.
  • Print the first few rows of the metadata table.
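
The same three steps can be sketched on a small made-up data frame. This is only an illustration: toy_tweets, its values, and toy_corpus are hypothetical and are not part of the exercise data. It assumes the tm package is installed and uses its VCorpus(), VectorSource(), and meta() functions.

```r
library(tm)

# Hypothetical stand-in for russian_tweets (values invented for illustration)
toy_tweets <- data.frame(
  content   = c("first example tweet", "second example tweet", "third one"),
  following = c(10, 250, 3),
  followers = c(5, 1200, 40),
  stringsAsFactors = FALSE
)

# Step 1: wrap the text column in a source, then build a corpus from it
toy_corpus <- VCorpus(VectorSource(toy_tweets$content))

# Step 2: attach one metadata value per document, tagged by name
meta(toy_corpus, "following") <- toy_tweets$following
meta(toy_corpus, "followers") <- toy_tweets$followers

# Step 3: inspect the document-level metadata table
head(meta(toy_corpus))
```

Each metadata vector must have one entry per document, which is why columns taken from the same data frame as the text line up naturally.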

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Create a corpus
tweet_corpus <- ___(___(russian_tweets$___))

# Attach following and followers
___(tweet_corpus, 'following') <- russian_tweets$___
___(tweet_corpus, 'followers') <- russian_tweets$___

# Review the metadata
head(meta(___))