Creating a regex that matches your needs
In this exercise, you're going to replicate what you just saw in the video exercise by extracting the letters "3D" from the "line" column of the screens_per_movie data frame.
For the extract() function to work correctly, one requirement must be met: the number of capturing groups in the regular expression regex must be identical to the length of the vector into. If that's not the case, you will run into an error.
Can you resolve this issue so that "3D" and the one or more digits matched by \\d+ get extracted correctly from the data frame screens_per_movie?
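To make the group-count requirement concrete, here is a minimal sketch on a made-up data frame. The data frame df, its column code, and the names in into are assumptions for illustration only and are not part of the course data. The first call has two capturing groups for two names in into and succeeds; the second has only one group, so extract() raises an error, which is caught here with tryCatch().

```r
library(tidyr)

# Hypothetical example data, not the course's screens_per_movie
df <- data.frame(code = c("AB-12", "CD-345"), stringsAsFactors = FALSE)

# Two capturing groups, two names in `into`: this works
ok <- extract(
  df,
  code,
  into = c("prefix", "number"),
  regex = "([A-Z]+)-(\\d+)"
)

# Only one capturing group but still two names in `into`: this fails
mismatch <- tryCatch(
  extract(
    df,
    code,
    into = c("prefix", "number"),
    regex = "([A-Z]+)-\\d+"
  ),
  # Return the error message instead of stopping the script
  error = function(e) conditionMessage(e)
)

print(ok)
print(mismatch)
```

Counting the opening parentheses of the capturing groups in regex and comparing that count with length(into) is a quick way to spot this kind of mismatch before running the call.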
This exercise is part of the course
Intermediate Regular Expressions in R
Exercise instructions
- Create a regular expression regex that has two capturing groups (). Their contents will be extracted into the new columns.
- Make sure you do not remove the original text column.
- Make sure the second captured group gets converted into numbers.
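The effect of the remove and convert arguments can be sketched on a small made-up data frame. The data frame orders, its column note, and the new column names are hypothetical and chosen only for illustration; the pattern is deliberately different from the one the exercise asks for.

```r
library(tidyr)

# Hypothetical example data, not the course's screens_per_movie
orders <- data.frame(
  note = c("express: 3 items", "standard: 12 items"),
  stringsAsFactors = FALSE
)

result <- extract(
  orders,
  note,
  into = c("kind", "items"),
  # Group 1: the shipping kind, group 2: one or more digits
  regex = "(express|standard): (\\d+) items",
  # Keep the original `note` column next to the new ones
  remove = FALSE,
  # Let tidyr convert the extracted strings to a fitting type
  convert = TRUE
)

str(result)
```

With remove = FALSE the source column stays in the result, and with convert = TRUE the digits captured by the second group are stored as numbers rather than as character strings.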
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
extract(
  screens_per_movie,
  line,
  into = c("is_3d", "screens"),
  # Capture two groups: "3D" and "one or more digits"
  regex = "___.*?___$",
  # Pass TRUE or FALSE, the original column should not be removed
  remove = ___,
  # Pass TRUE or FALSE, the result should get converted to numbers
  convert = ___
)