
Step 2: Identify Text Sources

In this short exercise, you will load and examine a small corpus of property rental reviews from around Boston. You probably already know read.csv(), which loads a comma-separated file into a data frame. This may seem mundane, but the point of this chapter is to take you through an entire workflow from start to finish, so let's begin with data ingestion!

Next, apply str() to review the data frame's structure. It is a convenient function for compactly displaying the class and the first few values of each column.

Lastly, you will apply dim() to print the dimensions of the data frame. For a data frame, the console prints the number of rows followed by the number of columns.

Other functions like head(), tail(), or summary() are often used for data exploration, but in this case we keep the examination short so you can get to the fun sentiment analysis!
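The three steps described above can be tried on a tiny stand-in file before touching the course data. This is only a sketch: the file contents, the column names id and comments, and the object names demo_file and demo_reviews are made up for illustration and say nothing about what the real Boston file contains.

```r
# Hypothetical stand-in data: the real course file and its columns are not shown here
demo_file <- tempfile(fileext = ".csv")
writeLines(
  c("id,comments",
    "1,\"Great location, close to the T\"",
    "2,Clean and quiet apartment",
    "3,Host was slow to respond"),
  demo_file
)

# Load the comma-separated file into a data frame
# (stringsAsFactors = FALSE keeps the review text as character on older R versions)
demo_reviews <- read.csv(demo_file, stringsAsFactors = FALSE)

# Compact overview: one line per column with its class and first values
str(demo_reviews)

# Dimensions: number of rows, then number of columns
dim(demo_reviews)

# A quick look at the first rows, as an alternative to str()
head(demo_reviews, 2)
```

With this made-up file, dim() reports 3 rows and 2 columns. The quoted first review shows that read.csv() keeps a comma inside double quotes as part of the text rather than treating it as a column separator.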

This exercise is part of the course

Sentiment Analysis in R


Exercise instructions

The Boston property rental reviews are stored in a CSV file whose path is held in the predefined variable bos_reviews_file.

  • Load the property reviews from bos_reviews_file with read.csv(). Call the object bos_reviews.
  • Examine the structure of the data frame using the base str() function applied to bos_reviews.
  • Find out how many reviews you are working with by calling dim() on bos_reviews.

Interactive exercise

Try this exercise by completing the sample code below.

# bos_reviews_file has been pre-defined
bos_reviews_file

# load raw text
bos_reviews <- ___

# Structure
___

# Dimensions
___