Build a corpus and convert to lowercase

A corpus is a list of text documents. You have to convert the tweet text into a corpus to facilitate subsequent steps in text processing.

When analyzing text, you want to ensure that a word is not counted as two different words because the case is different in the two instances. Hence, you need to convert text to lowercase.

In this exercise, you will create a text corpus and convert all of its characters to lowercase.

The cleaned text output from the previous exercise has been pre-loaded as twt_gsub.

The library tm has been pre-loaded for this exercise.
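
The idea described above can be sketched on toy data, separately from the exercise itself. This is a minimal illustration, not the exercise solution: the vector `sample_text` and the names `sample_corpus`, `sample_lower`, and `lowered` are hypothetical, and the sketch assumes the tm functions `VectorSource()`, `VCorpus()`, `tm_map()`, and `content_transformer()`. It uses `VCorpus()` for explicitness; the exercise may expect a different constructor.

```r
library(tm)

# Hypothetical stand-in for cleaned tweet text (not the course data)
sample_text <- c("Great Day at the BEACH", "great day at the beach")

# Wrap the character vector as a source, then build a corpus from it
sample_corpus <- VCorpus(VectorSource(sample_text))

# Lowercase every document; content_transformer() wraps tolower so that
# each document stays a text document inside the corpus
sample_lower <- tm_map(sample_corpus, content_transformer(tolower))

# Pull the text back out as a plain character vector for inspection
lowered <- vapply(seq_len(length(sample_lower)),
                  function(i) as.character(sample_lower[[i]]),
                  character(1))
print(lowered)
```

After lowercasing, the two sample documents hold identical text, so a later word count would no longer treat "BEACH" and "beach" as different words.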

This exercise is part of the course

Analyzing Social Media Data in R

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Convert text in "twt_gsub" dataset to a text corpus and view output
twt_corpus <- twt_gsub %>% 
                ___() %>% 
                ___() 
head(twt_corpus$___)