Digits, words and spaces
So far, you have probably only ever searched for one exact number or word. Now you have a much more flexible tool at hand, which lets you search for:
- \\d: digits (zero to nine)
- \\w: word characters (letters, numbers or underscores)
- \\s: white spaces (also tabs and line breaks)
Plus, you can use square brackets, as in [A-Za-z], to list the possible values that a single character may take.
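As a rough illustration of these building blocks, here is a minimal sketch using the stringr package. The vector samples and all variable names are made up for this example and are not part of the course data.

```r
library(stringr)

# Hypothetical sample strings, not taken from the course data
samples <- c("Room 101", "no_digits_here", "tab\there", "R2-D2")

# \\d matches any single digit
has_digit <- str_detect(samples, "\\d")

# \\s matches any whitespace character, including the tab in "tab\there"
has_space <- str_detect(samples, "\\s")

# \\w+ matches runs of word characters; the hyphen is not one of them
words <- str_extract_all("R2-D2", "\\w+")[[1]]

# Square brackets list the allowed values for one character:
# here, one capital letter directly followed by one digit
letter_digit <- str_extract(samples, "[A-Z]\\d")
```

Note that a character class such as [A-Z] always stands for exactly one character, no matter how many values are listed inside the brackets.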
You already found all sequels of "Saw". Can you create a pattern that matches all sequels in the list movie_titles? They usually have a number at the end, right?
Furthermore, the list contains duplicates introduced by "Grey" (British) and "Gray" (American English). Create a pattern that matches both versions of the color.
Lastly, list all movie titles that contain special, non-word characters.
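The three tasks rely on two pieces of syntax that are easy to overlook: the anchor $, which ties a pattern to the end of the string, and a caret placed directly after an opening square bracket, which negates the list. The following is only a sketch on a made-up vector called titles (an assumption for illustration, not the course's movie_titles), with a spelling pair that differs from the one in the exercise.

```r
library(stringr)

# Hypothetical titles, not the course's movie_titles
titles <- c("Rocky 2", "Rocky", "2 Guns",
            "The Analyser", "The Analyzer", "Wall-E")

# $ anchors the match to the end of the string,
# so "2 Guns" does not count even though it contains a digit
ends_with_digit <- titles[str_detect(titles, "\\s\\d$")]

# [sz] accepts either letter at that one position
both_spellings <- titles[str_detect(titles, "Analy[sz]er")]

# A leading ^ inside the brackets negates the list:
# match any character that is neither a word character nor whitespace
special <- titles[str_detect(titles, "[^\\w\\s]")]
```

Subsetting with the logical vector returned by str_detect() keeps only the matching titles, which is the same idiom the sample code in this exercise uses.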
This exercise is part of the course
Intermediate Regular Expressions in R
Exercise instructions
- Match all movie titles that end with a space followed by a digit.
- Match both "Grey" and "Gray" with a custom pattern […].
- Write a pattern that matches everything but word characters \\w and spaces \\s.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# List all movies that end with a space and a digit
movie_titles[str_detect(movie_titles,
  pattern = "___"
)]

# List all movies that contain "Grey" or "Gray"
movie_titles[str_detect(movie_titles,
  pattern = "Gr___y"
)]

# List all movies with strange characters (no word or space)
movie_titles[str_detect(movie_titles,
  pattern = "[___]"
)]