
Cleaning with qdap

The qdap package offers other text cleaning functions. Each is useful in its own way and is particularly powerful when combined with the others.

  • bracketX(): Remove all text within brackets (e.g. "It's (so) cool" becomes "It's cool")
  • replace_number(): Replace numbers with their word equivalents (e.g. "2" becomes "two")
  • replace_abbreviation(): Replace abbreviations with their full text equivalents (e.g. "Sr" becomes "Senior")
  • replace_contraction(): Convert contractions back to their base words (e.g. "shouldn't" becomes "should not")
  • replace_symbol(): Replace common symbols with their word equivalents (e.g. "$" becomes "dollar")
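
As a rough illustration of combining these functions, the sketch below chains all five in the order listed above. The helper name clean_text and the sample sentence are hypothetical, made up for this example; only the five qdap functions come from the package itself, and the order of the steps is one reasonable choice rather than a prescribed one.

```r
library(qdap)

# Hypothetical helper (not part of qdap): applies the five cleaners in sequence.
clean_text <- function(x) {
  x <- bracketX(x)              # drop bracketed text
  x <- replace_number(x)        # digits to words
  x <- replace_abbreviation(x)  # abbreviations to full words
  x <- replace_contraction(x)   # contractions to base words
  x <- replace_symbol(x)        # common symbols to words
  x
}

# Made-up sample sentence, chosen to exercise every step.
sample_text <- "It's (so) cool that Sr. Lee shouldn't pay $2."
cleaned <- clean_text(sample_text)
print(cleaned)
```

Because each function takes and returns a character vector, the same helper should also work on a vector of several documents.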

This exercise is part of the course

Text Mining with Bag-of-Words in R


Exercise instructions

Apply the following functions to the text object from the previous exercise:

  • bracketX()
  • replace_number()
  • replace_abbreviation()
  • replace_contraction()
  • replace_symbol()

Hands-on interactive exercise

Try this exercise by completing the sample code below.

## text is still loaded in your workspace

# Remove text within brackets
___

# Replace numbers with words
___

# Replace abbreviations
___

# Replace contractions
___

# Replace symbols with words
___
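
One possible way to fill in the blanks is sketched below, with each function called on text separately so the effect of every step can be inspected on its own. The value assigned to text here is a hypothetical stand-in so the snippet runs by itself; in the exercise, text is already loaded from the previous step and should not be redefined.

```r
library(qdap)

# Hypothetical stand-in; in the exercise, `text` already exists in the workspace.
text <- "She (my friend) isn't paying $5 to Dr. Smith for 3 books."

# Remove text within brackets
bracketX(text)

# Replace numbers with words
replace_number(text)

# Replace abbreviations
replace_abbreviation(text)

# Replace contractions
replace_contraction(text)

# Replace symbols with words
replace_symbol(text)
```

None of these calls modifies text in place; to keep a cleaned version, assign the result to a new name.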