grepl & grep (2)
You can use the caret, ^, and the dollar sign, $, to match content located at the start and end of a string, respectively. This takes us one step closer to a correct pattern for matching only the ".edu" email addresses in our list of emails. But more can be added to make the pattern more robust:
@, because a valid email must contain an at-sign.
.*, which matches any character (.) zero or more times (*). Both the dot and the asterisk are metacharacters. You can use them to match any characters between the at-sign and the ".edu" portion of an email address.
\\.edu$, to match the ".edu" part of the email at the end of the string. The \\ part escapes the dot: it tells R that you want to use the . as an actual character.
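The effect of each piece can be checked on a small vector. The sketch below is only an illustration: the addresses in x are made up for this example and are not the course's emails vector.

```r
# Hypothetical addresses, invented for illustration only
x <- c("ann@school.edu", "bob@education.org", "invalid.edu", "cat@dogedu")

# ^ anchors the match to the start of the string
starts_with_ann <- grepl("^ann", x)      # TRUE FALSE FALSE FALSE

# Unanchored: "edu" anywhere in the string counts as a match
anywhere <- grepl("edu", x)              # TRUE TRUE TRUE TRUE

# Escaped dot plus $: the string must end in a literal ".edu"
ends_edu <- grepl("\\.edu$", x)          # TRUE FALSE TRUE FALSE

# Adding @ and .* also requires an at-sign somewhere before ".edu"
robust <- grepl("@.*\\.edu$", x)         # TRUE FALSE FALSE FALSE

# Without the escape, the dot matches any character, so "cat@dogedu" slips through
unescaped <- grepl("@.*.edu$", x)        # TRUE FALSE FALSE TRUE
```

Comparing robust with unescaped shows why the \\ matters: the unescaped dot accepts any character in front of "edu", not only a literal dot.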
This is a part of the course “Intermediate R”
Exercise instructions
Have a go at this exercise by completing this sample code.
# The emails vector has already been defined for you
emails <- c("[email protected]", "[email protected]", "[email protected]",
            "invalid.edu", "[email protected]", "[email protected]")
# Use grepl() to match for .edu addresses more robustly
# Use grep() to match for .edu addresses more robustly, save result to hits
# Subset emails using hits
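One possible way to complete the three steps is sketched below. Because the addresses in the vector above are obscured, the sketch uses a stand-in vector, emails_demo, whose contents are assumptions made up for this example; the names pattern and is_edu are likewise illustrative. Only hits comes from the exercise comments.

```r
# Hypothetical stand-in for the emails vector (addresses are invented)
emails_demo <- c("ann@school.edu", "bob@education.org",
                 "invalid.edu", "cat@college.edu")

# Pattern built from the pieces described above: @, then .*, then \\.edu$
pattern <- "@.*\\.edu$"

# grepl() returns one logical value per element
is_edu <- grepl(pattern, emails_demo)

# grep() returns the indices of the matching elements
hits <- grep(pattern, emails_demo)

# Subset the vector using those indices
edu_emails <- emails_demo[hits]
print(edu_emails)
```

If only the matching strings are needed, grep(pattern, emails_demo, value = TRUE) should return the same elements without the separate subsetting step.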