
Subsetting strings based on match

Detecting which strings match a pattern and then subsetting to those strings is such a common operation that stringr provides a function, str_subset(), that does both in one step.

For example, let's repeat our search for "pepper" in our pizzas using str_subset():

library(stringr)
pizzas <- c("cheese", "pepperoni", "sausage and green peppers")
# Keep only the strings that contain the literal text "pepper"
str_subset(pizzas, pattern = fixed("pepper"))

We get a new vector of strings that contains only the original strings matching the pattern: here, "pepperoni" and "sausage and green peppers".
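To make the "one step" point concrete, here is a minimal sketch comparing str_subset() with the two-step approach of str_detect() followed by indexing. The variable names has_pepper, two_step and one_step are only illustrative and are not part of the course material.

```r
library(stringr)

pizzas <- c("cheese", "pepperoni", "sausage and green peppers")

# Two steps: build a logical vector, then use it to index the original vector
has_pepper <- str_detect(pizzas, pattern = fixed("pepper"))
two_step <- pizzas[has_pepper]

# One step: str_subset() returns the matching strings directly
one_step <- str_subset(pizzas, pattern = fixed("pepper"))

# Both approaches should give the same character vector
identical(two_step, one_step)
```

For this input, both approaches keep the same two strings, so str_subset() mainly saves the intermediate logical vector.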

str_subset() is easily confused with str_extract(). str_extract() returns a vector of the same length as the input vector, but each element holds only the part of the string that matched the pattern (or NA where nothing matched). This won't be very interesting until we know about regular expressions, so we'll talk more about it in Chapter 3.
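The difference between the two functions can be seen on the same pizzas vector. This is only a small sketch with a fixed pattern; the names subsetted and extracted are illustrative.

```r
library(stringr)

pizzas <- c("cheese", "pepperoni", "sausage and green peppers")

# str_subset() keeps whole strings, and only those that match
subsetted <- str_subset(pizzas, pattern = fixed("pepper"))

# str_extract() keeps one element per input string:
# the matched text, or NA where there was no match
extracted <- str_extract(pizzas, pattern = fixed("pepper"))

length(subsetted)  # 2: only the matching strings remain
length(extracted)  # 3: same length as pizzas
extracted          # NA for "cheese", "pepper" for the other two
```

With a fixed pattern the extracted pieces are all identical to the pattern itself, which is why str_extract() becomes more useful once patterns can match varying text.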

For now, you'll repeat part of the last exercise using str_subset() and then find a few other interesting names.

This is a part of the course

“String Manipulation with stringr in R”


Exercise instructions

  • Find the boy_names that contain "zz", using str_subset().
  • Find the girl_names that contain "zz".
  • Find the girl_names that contain "U" and save them to starts_U. Because pattern matching is case sensitive, this will only find names that start with "U".
  • Feed starts_U into another str_subset() that looks for "z". Combining multiple str_subset() calls is a way to find more complicated patterns.
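As a rough illustration of chaining str_subset() calls, the sketch below uses a small made-up vector, toy_names, rather than the course's girl_names data (which is not reproduced here). The names with_U and with_U_and_z are likewise hypothetical.

```r
library(stringr)

# Hypothetical stand-in data; NOT the real girl_names vector from the course
toy_names <- c("Ursula", "Uzma", "Liz", "Unique", "Suzu")

# First pass: names containing an uppercase "U" (matching is case sensitive,
# so the lowercase "u" in "Suzu" does not count)
with_U <- str_subset(toy_names, pattern = fixed("U"))

# Second pass: of those, keep names that also contain a lowercase "z"
with_U_and_z <- str_subset(with_U, pattern = fixed("z"))

with_U
with_U_and_z
```

Each call narrows the result of the previous one, so the final vector contains only strings that satisfy both conditions.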

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Find boy_names that contain "zz"
___

# Find girl_names that contain "zz"
___

# Find girl_names that contain "U"
starts_U <- ___
starts_U

# Find girl_names that contain "U" and "z"
___

