grepl & grep
In their most basic form, regular expressions can be used to see whether a pattern exists inside a character string or a vector of character strings. For this purpose, you can use:
- grepl(), which returns TRUE when a pattern is found in the corresponding character string.
- grep(), which returns a vector of indices of the character strings that contain the pattern.
Both functions need a pattern and an x argument, where pattern is the regular expression you want to match, and x is the character vector in which matches are sought.
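To make the difference between the two return values concrete, here is a minimal sketch. The vector animals and the variable names has_a and idx_a are made up for illustration and are not part of the exercise.

```r
# Hypothetical example vector (not from the exercise)
animals <- c("cat", "moose", "impala", "ant", "kiwi")

# grepl() returns one logical per element of x
has_a <- grepl(pattern = "a", x = animals)
print(has_a)  # TRUE FALSE TRUE TRUE FALSE

# grep() returns the positions of the elements that match
idx_a <- grep(pattern = "a", x = animals)
print(idx_a)  # 1 3 4
```

The two results carry the same information: which(has_a) gives the same indices as idx_a.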
In this and the following exercises, you'll be querying and manipulating a character vector of email addresses! The vector emails has been pre-defined so you can begin with the instructions straight away!
This is a part of the course “Intermediate R”
Exercise instructions
- Use grepl() to generate a vector of logicals that indicates whether these email addresses contain "edu". Print the result to the output.
- Do the same thing with grep(), but this time save the resulting indexes in a variable hits.
- Use the variable hits to select from the emails vector only the emails that contain "edu".
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# The emails vector has already been defined for you
emails <- c("[email protected]", "[email protected]", "[email protected]",
"invalid.edu", "[email protected]", "[email protected]")
# Use grepl() to match for "edu"
# Use grep() to match for "edu", save result to hits
# Subset emails using hits
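
One possible way to work through the three steps is sketched below. It assumes a stand-in vector, sample_emails, with made-up addresses, since the real emails vector is pre-defined by the exercise environment; only the grepl(), grep(), and subsetting calls carry over.

```r
# Hypothetical stand-in addresses; the exercise supplies its own emails vector
sample_emails <- c("alice@example.edu", "bob@example.com",
                   "invalid.edu", "carol@example.org")

# Step 1: logical vector, TRUE where the string contains "edu"
print(grepl("edu", sample_emails))

# Step 2: indices of the matching strings, saved to hits
hits <- grep("edu", sample_emails)

# Step 3: subset the vector using those indices
print(sample_emails[hits])
```

Note that "edu" is matched anywhere in the string, so "invalid.edu" is selected even though it is not a valid email address.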