Step 2: Identify Text Sources
In this short exercise you will load and examine a small corpus of property rental reviews from around Boston. You have probably already used read.csv(), which loads a comma-separated file into a data frame. This may seem mundane, but the point of this chapter is to take you through an entire workflow from start to finish, so let's begin with data ingestion!
Next, apply str() to review the data frame's structure. It is a convenient function for compactly displaying each column's class and its first few values.
Lastly, apply dim() to print the dimensions of the data frame. For a data frame, the console prints the number of rows followed by the number of columns.
Other functions such as head(), tail(), or summary() are often used for data exploration, but here we keep the examination short so you can get to the fun part: sentiment analysis!
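As a rough illustration of the load-and-inspect steps described above, here is a minimal sketch on made-up data. The temporary file, the object name example_reviews, and the columns id and comments are all hypothetical stand-ins, not the actual Boston reviews file.

```r
# Hypothetical stand-in data written to a temporary CSV file;
# the real exercise reads from bos_reviews_file instead.
example_file <- tempfile(fileext = ".csv")
writeLines(
  c("id,comments",
    "1,\"Great location, very clean\"",
    "2,\"Noisy street but friendly host\"",
    "3,\"Would stay again\""),
  example_file
)

# Load the comma-separated file into a data frame
example_reviews <- read.csv(example_file, stringsAsFactors = FALSE)

# Compact view of each column's class and first values
str(example_reviews)

# Number of rows, then number of columns (here 3 and 2)
dim(example_reviews)

# Optional extra exploration: first rows and per-column summaries
head(example_reviews, 2)
summary(example_reviews)
```

Setting stringsAsFactors = FALSE keeps the review text as character strings regardless of R version, which is usually what you want before text analysis.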
This exercise is part of the course
Sentiment Analysis in R
Exercise instructions
The Boston property rental reviews are stored in a CSV file whose path is held in the predefined variable bos_reviews_file.
- Load the property reviews from bos_reviews_file with read.csv(). Call the object bos_reviews.
- Examine the structure of the data frame using the base str() function applied to bos_reviews.
- Find out how many reviews you are working with by calling dim() on bos_reviews.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# bos_reviews_file has been pre-defined
bos_reviews_file
# load raw text
bos_reviews <- ___
# Structure
___
# Dimensions
___