Exercise

Combining with stringr functions

You can pass a regular expression as the pattern argument to any stringr function that takes one. Use str_detect() to get a logical vector indicating whether each string has a match, str_subset() to return just the strings with matches, and str_count() to count the number of matches in each string.

As a reminder, compare the output of those three functions with our "c_t" pattern from the previous exercise:

x <- c("cat", "coat", "scotland", "tic toc")
pattern <- "c" %R% ANY_CHAR %R% "t"
str_detect(x, pattern)
str_subset(x, pattern)
str_count(x, pattern)

It now also makes sense to add str_extract() to your repertoire. It returns just the part of the string that matched the pattern:

str_extract(x, pattern)
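
For reference, here is a minimal sketch of what the four calls return for this x. It assumes only stringr is loaded and writes the pattern as the plain regular expression "c.t", which should be equivalent to the rebus construction above. The results in the comments come from matching by hand.

```r
library(stringr)

x <- c("cat", "coat", "scotland", "tic toc")
# Assumption: plain-regex stand-in for "c" %R% ANY_CHAR %R% "t"
pattern <- "c.t"

detected <- str_detect(x, pattern)   # TRUE FALSE TRUE TRUE
matched  <- str_subset(x, pattern)   # "cat" "scotland" "tic toc"
counts   <- str_count(x, pattern)    # 1 0 1 1
parts    <- str_extract(x, pattern)  # "cat" NA "cot" "c t"
```

"coat" has two characters between c and t, so it does not match and str_extract() returns NA for it. In "tic toc" the matched part is "c t", because the space counts as a character.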

You'll combine your regular expression skills with stringr to ask how often a q is followed by any character in boy names.

It's always a good idea to test your pattern, so this pattern is shown matched with four names. The first two shouldn't have matches (can you explain why?) but the last two should.

Instructions 1/4
  • 1
    • Find the boy_names with the pattern by using str_subset(). Assign the result to names_with_q.
    • Run length() on the result to find out how many there are.
  • 2
    • Find just the part of boy_names that matched with str_extract(). Assign the result to part_with_q.
    • Run table() on the result to find out how many have qu and how many have other patterns.
  • 3
    • Check that there weren't any boy_names that might have had the pattern twice (str_extract() returns only the first match) by using str_count(). Assign the result to count_of_q.
    • Use table() on the result.
  • 4
    • Get a logical vector of whether or not each boy's name contains the pattern by calling str_detect(). Assign the result to with_q.
    • Calculate the fraction of boy names containing the pattern by taking the mean() of with_q.
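
The four steps above can be sketched as follows. This is an illustration under stated assumptions, not the exercise solution: boy_names here is a small hypothetical vector standing in for the real data, and the pattern is written as the plain regular expression "q.", assumed to mean the same as a lowercase q followed by any character.

```r
library(stringr)

# Hypothetical stand-in for the exercise's boy_names vector
boy_names <- c("Quentin", "Kaliq", "Jacques", "Jacqes", "Enrique", "Liam")
# Assumption: lowercase q followed by any single character
pattern <- "q."

# Step 1: names that contain the pattern, and how many there are
names_with_q <- str_subset(boy_names, pattern)
n_with_q <- length(names_with_q)

# Step 2: the matched part of each name (NA where nothing matched)
part_with_q <- str_extract(boy_names, pattern)
q_table <- table(part_with_q)   # table() drops NA by default

# Step 3: number of matches per name, to check for repeated matches
count_of_q <- str_count(boy_names, pattern)
count_table <- table(count_of_q)

# Step 4: logical vector of matches; its mean is the fraction of TRUE
with_q <- str_detect(boy_names, pattern)
fraction_with_q <- mean(with_q)
```

With this toy vector, "Quentin" does not match because the pattern is case sensitive, and "Kaliq" does not match because nothing follows the q. Taking mean() of a logical vector works because TRUE counts as 1 and FALSE as 0.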