
Cleaning with qdap

The qdap package offers other text cleaning functions. Each is useful in its own way and is particularly powerful when combined with the others.

  • bracketX(): Remove all text within brackets (e.g. "It's (so) cool" becomes "It's cool")
  • replace_number(): Replace numbers with their word equivalents (e.g. "2" becomes "two")
  • replace_abbreviation(): Replace abbreviations with their full text equivalents (e.g. "Sr" becomes "Senior")
  • replace_contraction(): Convert contractions back to their base words (e.g. "shouldn't" becomes "should not")
  • replace_symbol(): Replace common symbols with their word equivalents (e.g. "$" becomes "dollar")
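
As a rough illustration of combining these functions, the sketch below chains all five on a single string. The sample sentence and the helper name `clean_with_qdap` are made up for this example and are not part of the course; it assumes the qdap package is installed. The order of the calls can affect the result, so treat this as one possible arrangement rather than the required one.

```r
library(qdap)

# Hypothetical sample sentence (not the course's `text` object)
sample_text <- "She (probably) won't pay $5 for 2 coffees."

# Hypothetical helper that applies the five qdap cleaners in sequence
clean_with_qdap <- function(x) {
  x <- bracketX(x)              # drop bracketed text
  x <- replace_number(x)        # numbers to words
  x <- replace_abbreviation(x)  # abbreviations to full text
  x <- replace_contraction(x)   # contractions to base words
  x <- replace_symbol(x)        # symbols to words
  x
}

cleaned <- clean_with_qdap(sample_text)
print(cleaned)
```

Wrapping the steps in one function makes it easy to reuse the same cleaning sequence on other character vectors.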

This exercise is part of the course

Text Mining with Bag-of-Words in R


Instructions

Apply the following functions to the text object from the previous exercise:

  • bracketX()
  • replace_number()
  • replace_abbreviation()
  • replace_contraction()
  • replace_symbol()

Hands-on interactive exercise

Try this exercise by completing this sample code.

## text is still loaded in your workspace

# Remove text within brackets
___

# Replace numbers with words
___

# Replace abbreviations
___

# Replace contractions
___

# Replace symbols with words
___