Replacing with regular expressions
You've mastered matching with backreferences, and you'll soon build up to replacing with them. First, though, let's review str_replace(), now that you've got regular expressions under your belt.
Remember, str_replace() takes three arguments: string, a vector of strings to do the replacements in; pattern, which identifies the parts of the strings to replace; and replacement, the thing to use as a replacement. replacement can be a vector the same length as string, in which case each element specifies the replacement to be used in the corresponding string. Let's practice by anonymizing some of the contact objects you've seen so far.
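As a rough illustration of the three arguments, here is a minimal sketch. The vector ids is made up for this example (it is not one of the course objects), and the plain regex "\\d" is used for a digit so that the snippet needs only stringr.

```r
library(stringr)

# Hypothetical input: three made-up strings, one of them without digits
ids <- c("room 101", "call 555-0199", "no digits here")

# A single replacement is reused for every string;
# str_replace() changes only the first match in each string
single <- str_replace(ids, pattern = "\\d", replacement = "X")
print(single)
# "room X01" "call X55-0199" "no digits here"

# A replacement vector the same length as `string`:
# element i of `replacement` is used for string i
per_string <- str_replace(ids, pattern = "\\d", replacement = c("X", "#", "?"))
print(per_string)
# "room X01" "call #55-0199" "no digits here"
```

Strings with no match, like the third one, come back unchanged, so the "?" replacement is never used.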
This is a part of the course “String Manipulation with stringr in R”.
Exercise instructions
Text containing phone numbers has been pre-defined in a variable named contact.
- Replace a digit in contact with "X" using str_replace().
- Replace all digits in contact with "X" using str_replace_all(). (str_replace() will only replace the first match to the pattern. str_replace_all() will replace all matches to the pattern.)
- Replace all digits in contact using str_replace_all(), but now specify the vector c("X", ".", "*", "_") as replacement. Notice how each string now uses a different replacement character.
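The difference between the three steps above can be sketched on a small made-up vector. Here phones is a hypothetical stand-in for contact, and "\\d" is a plain regex for a digit used in place of the DGT shortcut that appears in the sample code.

```r
library(stringr)

# Hypothetical stand-in for `contact`: four made-up strings
phones <- c("555-0101", "ext 42", "tel: 7 8 9", "n/a 0")

# Only the first digit in each string is replaced
first_only <- str_replace(phones, "\\d", "X")
print(first_only)
# "X55-0101" "ext X2" "tel: X 8 9" "n/a X"

# Every digit in each string is replaced
all_digits <- str_replace_all(phones, "\\d", "X")
print(all_digits)
# "XXX-XXXX" "ext XX" "tel: X X X" "n/a X"

# One replacement character per string, matched up by position
by_string <- str_replace_all(phones, "\\d", c("X", ".", "*", "_"))
print(by_string)
# "XXX-XXXX" "ext .." "tel: * * *" "n/a _"
```

The replacement vector has four elements because phones has four strings, mirroring the four-element vector in the last instruction.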
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# View text containing phone numbers
contact
# Replace digits with "X"
str_replace(contact, DGT, ___)
# Replace all digits with "X"
str_replace_all(contact, DGT, ___)
# Replace all digits with a different symbol for each string
str_replace_all(contact, DGT, ___)
Learn how to pull character strings apart, put them back together and use the stringr package.
Now for two advanced ways to use regular expressions along with stringr: selecting parts of a match (a.k.a. capturing) and referring back to parts of a match (a.k.a. back-referencing). You'll also learn to deal with strings or patterns that contain Unicode characters (e.g. é).