grepl & grep
In their most basic form, regular expressions can be used to see whether a pattern exists inside a character string or a vector of character strings. For this purpose, you can use:
- grepl(), which returns TRUE when a pattern is found in the corresponding character string.
- grep(), which returns a vector of indices of the character strings that contain the pattern.
Both functions need a pattern and an x argument, where pattern is the regular expression you want to match, and x is the character vector in which matches are sought.
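To make the difference between the two return values concrete, here is a minimal sketch in base R. The vector animals and the variables has_a and idx_a are hypothetical names introduced only for illustration; they are not part of the exercise.

```r
# Hypothetical example vector (an assumption, not part of the exercise)
animals <- c("cat", "moose", "impala", "ant", "kiwi")

# grepl() returns one logical per element: TRUE where "a" occurs
has_a <- grepl(pattern = "a", x = animals)
print(has_a)

# grep() returns the positions of the matching elements instead
idx_a <- grep(pattern = "a", x = animals)
print(idx_a)

# Both results describe the same matches: which() turns the
# logicals into the indices that grep() reports
print(identical(which(has_a), idx_a))
```

The logical vector is convenient when you need one answer per element, while the index vector is convenient for subsetting.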
In this and the following exercises, you'll be querying and manipulating a character vector of email addresses! The vector emails has been pre-defined so you can begin with the instructions straight away!
This exercise is part of the course
Intermediate R
Exercise instructions
- Use grepl() to generate a vector of logicals that indicates whether these email addresses contain "edu". Print the result to the output.
- Do the same thing with grep(), but this time save the resulting indexes in a variable hits.
- Use the variable hits to select from the emails vector only the emails that contain "edu".
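The three steps can be sketched on a separate, made-up vector so the exercise itself stays unsolved. The names demo_emails, demo_flags and demo_hits, and the addresses themselves, are assumptions chosen for illustration only.

```r
# Hypothetical addresses for illustration (not the exercise's emails vector)
demo_emails <- c("ann@school.edu", "bob@city.gov",
                 "education@news.org", "carl@lab.edu")

# Step 1: logical vector, TRUE where "edu" appears anywhere in the string
demo_flags <- grepl(pattern = "edu", x = demo_emails)
print(demo_flags)

# Step 2: indices of the elements that contain "edu"
demo_hits <- grep(pattern = "edu", x = demo_emails)

# Step 3: use the indices to keep only the matching elements
print(demo_emails[demo_hits])
```

Note that the plain pattern "edu" is matched anywhere in the string, so the third address in this sketch is selected because of "education", even though its domain is not .edu.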
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# The emails vector has already been defined for you
emails <- c("[email protected]", "[email protected]", "[email protected]",
            "invalid.edu", "[email protected]", "[email protected]")
# Use grepl() to match for "edu"
# Use grep() to match for "edu", save result to hits
# Subset emails using hits