Explore an R corpus

One of your coworkers has prepared a corpus of 20 documents about crude oil, named crude. It is only a sample of the several thousand articles you will receive next week. To get ready for running text analysis on these documents, you have decided to explore their content and metadata. Remember that in R, a VCorpus stores both metadata (meta) and text (content) for each document. In this lesson, you will explore these two components.
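To make the meta/content layout concrete, here is a minimal sketch using the tm package. It builds a small stand-in corpus rather than crude; the names texts, toy, and doc and the two sentences are made up for illustration and are not part of the exercise.

```r
library(tm)

# Hypothetical two-document corpus, used only for illustration
texts <- c("Oil prices rose on Monday.", "OPEC met to discuss output.")
toy <- VCorpus(VectorSource(texts))

# Printing the corpus gives a short summary of what it holds
print(toy)

# Each document is reached with [[ ]] and carries two components
doc <- toy[[2]]
doc$content   # the text itself
doc$meta      # metadata fields such as id and language

# Individual metadata fields sit one level deeper
doc$meta$id
```

The same access pattern should carry over to any VCorpus, including the one used in this exercise.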

This exercise is part of the course

Introduction to Natural Language Processing in R

Exercise instructions

  • Print out crude and review the output.
  • Print the content of the 10th article.
  • Print out the ID of the first article in crude.
  • Using the provided for loop, make a vector of the IDs from the corpus.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Print out the corpus
print(___)

# Print the content of the 10th article
crude[[___]]$___

# Find the first ID
crude[[___]]$___$id

# Make a vector of IDs
ids <- c()
for (i in 1:20) {
  ids <- append(ids, crude[[___]]$___$id)
}
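
Growing a vector with append() inside a loop works for 20 documents, but it copies the vector on every pass. As a hedged sketch of two alternatives, the code below preallocates the result and then shows a vectorized version with vapply(). It runs on a made-up two-document corpus; toy, ids_loop, and ids_vec are illustrative names, not part of the exercise.

```r
library(tm)

# Hypothetical two-document corpus, used only for illustration
toy <- VCorpus(VectorSource(c("Oil prices rose.", "OPEC met on output.")))

# Loop version: preallocate the result instead of growing it with append()
ids_loop <- character(length(toy))
for (i in seq_len(length(toy))) {
  ids_loop[i] <- toy[[i]]$meta$id
}

# Vectorized version: apply the same accessor to every document
ids_vec <- vapply(toy, function(doc) doc$meta$id, character(1),
                  USE.NAMES = FALSE)

# Both approaches should produce the same character vector
identical(ids_loop, ids_vec)
```

Using length(toy) instead of a hard-coded 20 keeps the code valid when the full set of articles arrives.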