Step 2: Identify Text Sources
In this short exercise you will load and examine a small corpus of property rental reviews from around Boston. You probably already know read.csv(), which loads a comma-separated file into a data frame. This may seem mundane, but the point of this chapter is to take you through an entire workflow from start to finish, so let's begin with data ingestion!
Next, apply str() to review the data frame's structure. It is a convenient function for compactly displaying the class type and initial values of each vector (here, each column of the data frame).
Lastly, apply dim() to print the dimensions of the data frame. For a data frame, your console will print the number of rows followed by the number of columns.
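As a rough illustration of these three calls, here is a minimal sketch on made-up data. The file toy_file and the contents of toy_reviews are hypothetical stand-ins and are not the Boston reviews; only read.csv(), str() and dim() come from the exercise text.

```r
# Hypothetical stand-in data: three made-up reviews written to a temporary CSV
toy_file <- tempfile(fileext = ".csv")
toy_reviews <- data.frame(
  id = 1:3,
  comments = c("Great location", "Noisy street", "Very clean and cozy"),
  stringsAsFactors = FALSE
)
write.csv(toy_reviews, toy_file, row.names = FALSE)

# Load the comma-separated file into a data frame
reviews <- read.csv(toy_file, stringsAsFactors = FALSE)

# Compact display of each column's class and first values
str(reviews)

# Number of rows, then number of columns
dim(reviews)  # prints [1] 3 2
```

Passing stringsAsFactors = FALSE keeps the review text as character strings regardless of R version, which is usually what you want before any text processing.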
Other functions like head(), tail() or summary() are often used for data exploration, but in this case we keep the examination short so you can get to the fun sentiment analysis!
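For reference, the other exploration functions mentioned here could be used roughly as follows. This is only a sketch: toy_reviews and its values are invented for illustration and are not part of the exercise data.

```r
# Hypothetical stand-in for a reviews data frame (made-up values)
toy_reviews <- data.frame(
  id = 1:8,
  comments = paste("Review number", 1:8),
  stringsAsFactors = FALSE
)

# First three rows
first_rows <- head(toy_reviews, 3)
print(first_rows)

# Last three rows
last_rows <- tail(toy_reviews, 3)
print(last_rows)

# Per-column summaries: quartiles for numeric columns,
# length and class for character columns
summary(toy_reviews)
```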
This exercise is part of the course
Sentiment Analysis in R
Exercise instructions
The Boston property rental reviews are stored in a CSV file whose location is held in the predefined variable bos_reviews_file.
- Load the property reviews from bos_reviews_file with read.csv(). Call the object bos_reviews.
- Examine the structure of the data frame using the base str() function applied to bos_reviews.
- Find out how many reviews you are working with by calling dim() on bos_reviews.
Interactive exercise
Try this exercise by completing the sample code.
# bos_reviews_file has been pre-defined
bos_reviews_file
# load raw text
bos_reviews <- ___
# Structure
___
# Dimensions
___