Subsetting strings based on match
Since detecting strings with a pattern and then subsetting out those strings is such a common operation, stringr provides a function str_subset() that does that in one step.
For example, let's repeat our search for "pepper" in our pizzas using str_subset():
library(stringr)
pizzas <- c("cheese", "pepperoni", "sausage and green peppers")
# Keep only the strings that contain the literal text "pepper"
str_subset(pizzas, pattern = fixed("pepper"))
We get a new vector of strings, but it only contains those original strings that contained the pattern.
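To make the "one step" claim concrete, here is a minimal sketch that compares the two-step approach (str_detect() followed by subsetting) with str_subset() on the same pizzas vector. The variable names has_pepper, two_step and one_step are illustrative and not part of the course material.

```r
library(stringr)

pizzas <- c("cheese", "pepperoni", "sausage and green peppers")

# Two steps: build a logical vector of matches, then subset with it
has_pepper <- str_detect(pizzas, pattern = fixed("pepper"))
two_step <- pizzas[has_pepper]

# One step: str_subset() returns only the matching strings
one_step <- str_subset(pizzas, pattern = fixed("pepper"))

# Both approaches give the same result
identical(two_step, one_step)
```

Both vectors should contain "pepperoni" and "sausage and green peppers", so the comparison is expected to return TRUE.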
str_subset() is easily confused with str_extract(). str_extract() returns a vector the same length as the input vector, containing only the part of each string that matched the pattern, with NA where nothing matched. This won't be very interesting until we know about regular expressions, so we'll talk more about this in Chapter 3.
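The difference is easiest to see side by side. The sketch below runs both functions on the pizzas vector with the same fixed pattern; the names subset_result and extract_result are illustrative only.

```r
library(stringr)

pizzas <- c("cheese", "pepperoni", "sausage and green peppers")

# str_subset(): whole strings that match; non-matching strings are dropped
subset_result <- str_subset(pizzas, pattern = fixed("pepper"))
subset_result
# "pepperoni" "sausage and green peppers"

# str_extract(): one element per input string, holding only the matched
# text, or NA when the string does not match
extract_result <- str_extract(pizzas, pattern = fixed("pepper"))
extract_result
# NA "pepper" "pepper"
```

Note that the str_subset() result has two elements while the str_extract() result keeps all three positions of the input.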
For now, you'll repeat part of the last exercise using str_subset() and then find a few other interesting names.
This exercise is part of the course
String Manipulation with stringr in R
Exercise instructions
- Find the boy_names that contain "zz", using str_subset().
- Find the girl_names that contain "zz".
- Find the girl_names that contain "U" and save into starts_U. Since the pattern matching is case sensitive, this will only be names that start with "U".
- Feed starts_U into another str_subset() that looks for "z". Combining multiple str_subset() calls is a way to find more complicated patterns.
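The idea of combining str_subset() calls can be sketched on a small made-up vector. The vector example_names below is hypothetical and is not taken from boy_names or girl_names; it only illustrates how a second call narrows the result of the first.

```r
library(stringr)

# Hypothetical names for illustration (not from the course data)
example_names <- c("Ursula", "Uzma", "Suzanne", "Una", "Zelda")

# First filter: names containing an uppercase "U" (matching is case sensitive)
with_U <- str_subset(example_names, pattern = fixed("U"))
with_U
# "Ursula" "Uzma" "Una"

# Second filter: of those, names that also contain a lowercase "z"
with_U_and_z <- str_subset(with_U, pattern = fixed("z"))
with_U_and_z
# "Uzma"
```

"Suzanne" is dropped by the first call because its "u" is lowercase, and "Zelda" is dropped because it has no "U" at all, so only a name satisfying both conditions survives the second call.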
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Find boy_names that contain "zz"
___
# Find girl_names that contain "zz"
___
# Find girl_names that contain "U"
starts_U <- ___
starts_U
# Find girl_names that contain "U" and "z"
___