Load some text
Text mining begins with loading some text data into R, which we'll do with the read.csv() function.
A best practice is to examine the object you read in to make sure you know which column(s) are important. The str() function provides an efficient way of doing this.
If the data frame contains columns that are not text, you may want to make a new object using only the correct column of text (e.g., some_object$column_name).
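The three steps above can be tried on a small stand-in file before touching the real data. This is a minimal sketch, not the exercise solution: the temporary file, its column names (id, text, retweets), and its two rows are invented for illustration, and the names toy_file, toy_tweets, and toy_text are hypothetical.

```r
# Toy CSV written to a temporary file; the column names and rows are
# invented for illustration and are not taken from the course data.
toy_file <- tempfile(fileext = ".csv")
writeLines(
  c("id,text,retweets",
    "1,I love coffee in the morning,3",
    "2,Need more coffee,0"),
  toy_file
)

# Read the file, keeping strings as character vectors rather than factors
toy_tweets <- read.csv(toy_file, stringsAsFactors = FALSE)

# Inspect the structure to see each column's name and type
str(toy_tweets)

# Keep only the text column, selected by name with the $ operator
toy_text <- toy_tweets$text
```

In this sketch, str() lists id and retweets as integer columns and text as a character column, which is the cue for choosing text as the column to isolate.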
Be aware that this is real data from Twitter, so there is always a risk that it contains profanity or other offensive content. This applies to this exercise and to any following exercises that also use real Twitter data.
This exercise is part of the course Text Mining with Bag-of-Words in R.
Exercise instructions
The data has been loaded for you and is available in coffee_data_file.
- Create a new object tweets using read.csv() on the file coffee_data_file, which contains tweets mentioning coffee.
- Examine the tweets object using str() to determine which column has the text you'll want to analyze.
- Make a new coffee_tweets object using only the text column you identified earlier. To do so, use the $ operator and column name.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import text data from CSV, no factors
tweets <- ___
# View the structure of tweets
___
# Isolate text from tweets
coffee_tweets <- ___
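The "no factors" note in the first comment refers to the stringsAsFactors argument of read.csv(). The sketch below, on an invented one-column file, shows what the argument changes; the names toy_file, as_factor, and as_text are hypothetical and not part of the exercise. Since R 4.0.0 the default is already FALSE, so passing it explicitly mainly matters on older versions or for clarity.

```r
# One-column toy CSV; the contents are invented for illustration.
toy_file <- tempfile(fileext = ".csv")
writeLines(c("text", "coffee time", "more coffee"), toy_file)

# Same file read twice, differing only in stringsAsFactors
as_factor <- read.csv(toy_file, stringsAsFactors = TRUE)
as_text   <- read.csv(toy_file, stringsAsFactors = FALSE)

class(as_factor$text)  # "factor": values stored as category codes
class(as_text$text)    # "character": values stored as plain strings

# String functions work directly on the character version
nchar(as_text$text)
```

Character vectors are usually what text-cleaning functions expect, which is why the sample code asks for the import to avoid factors.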