Creating a corpus
You have created a tibble called russian_tweets that contains around 20,000 tweets auto-generated by bots during the 2016 U.S. election cycle so that you can perform text analysis. After reviewing the available options for the analysis you have chosen, you believe that the tm package offers the easiest path forward. To conduct the analysis, you must first create a corpus and attach potentially useful metadata.
Be aware that this is real data from Twitter and as such there is always a risk that it may contain profanity or other offensive content (in this exercise, and any following exercises that also use real Twitter data).
This exercise is part of the course
Introduction to Natural Language Processing in R
Instructions
- Create a corpus using the content column of russian_tweets.
- Attach both the following and followers columns as metadata to tweet_corpus.
- Print the first few rows of the metadata table.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Create a corpus
tweet_corpus <- ___(___(russian_tweets$___))
# Attach following and followers
___(tweet_corpus, 'following') <- russian_tweets$___
___(tweet_corpus, 'followers') <- russian_tweets$___
# Review the meta data
head(meta(___))
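For reference, here is a minimal sketch of how the filled-in template might look. It assumes the blanks correspond to tm's VCorpus(), VectorSource(), and meta() functions, which is one plausible reading of the template and not a confirmed solution. Because russian_tweets is not available outside the course, toy_tweets is a hypothetical three-row stand-in with made-up values.

```r
library(tm)

# Hypothetical stand-in for russian_tweets (values are made up for illustration)
toy_tweets <- data.frame(
  content   = c("first example tweet", "second example tweet", "third example tweet"),
  following = c(10L, 250L, 3L),
  followers = c(100L, 5L, 42L),
  stringsAsFactors = FALSE
)

# Create a corpus: VectorSource() treats each element of the vector as one document
tweet_corpus <- VCorpus(VectorSource(toy_tweets$content))

# Attach following and followers as document-level metadata
meta(tweet_corpus, 'following') <- toy_tweets$following
meta(tweet_corpus, 'followers') <- toy_tweets$followers

# Review the metadata: one row per document, one column per attached tag
head(meta(tweet_corpus))
```

With the course data, the same calls would use russian_tweets in place of toy_tweets. The metadata vectors must have one entry per document, which holds here because they come from the same table as the text.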